Parsoid upgrade notice

낙엽1124

I was going to write this on 페:업데이트, but on reflection it doesn't belong there, so I'm posting it here.

Parsoid is the service that converts between wikitext and HTML. We had been running version 0.5.3, and at 2 a.m. on Thursday, October 25, we upgraded to 0.9.0.

The old 0.5.3 was only barely compatible with MediaWiki 1.31, the version Femiwiki currently runs, so unexpected problems cropped up from time to time. I expect this upgrade to have resolved them.

Please see the following for the changes.


0.9.0 / 2018-03-23

Notable wt -> html changes

  • Parsoid HTML version bumped to 1.6.1
  • T114072: Add <section> wrappers to Parsoid output
  • T118520: Use figure-inline instead of span for inline media
  • Update Parsoid to generate modern HTML5 IDs w/ legacy fallback
  • T58756: External links class= now setting free, text and autonumber
  • T45094: Replace <span> with <sup> for references
  • T97093: Use mw:WikiLink/Interwiki for interwiki links
  • Permit extension tags in xmlish attribute values
  • A number of bug fixes and crasher fixes

Notable html -> wt changes:

  • Preserve original transclusion's parameter order
  • T180930: Selser shouldn't reuse orig sep for autoinserted tags
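To give a feel for what "preserve original transclusion's parameter order" means on the html -> wt side, here is a minimal sketch. It is not Parsoid's actual code; the function names `serialize_params` and `to_wikitext` are made up for illustration, and the rule for where newly added parameters go (appended at the end) is an assumption.

```python
def serialize_params(original_order, edited):
    """Order template parameters for serialization.

    original_order: parameter names as they appeared in the source wikitext.
    edited: dict of name -> value after the edit (hypothetical input shape).
    Surviving original names keep their original order; names that are new
    are appended afterwards (an assumption made for this sketch).
    """
    kept = [name for name in original_order if name in edited]
    added = [name for name in edited if name not in original_order]
    return [(name, edited[name]) for name in kept + added]


def to_wikitext(template, pairs):
    """Render a transclusion from ordered (name, value) pairs."""
    body = "".join("|" + name + "=" + value for name, value in pairs)
    return "{{" + template + body + "}}"


pairs = serialize_params(["b", "a"], {"a": "1", "b": "2", "c": "3"})
print(to_wikitext("Infobox", pairs))
```

The point is that an edit that touches only one value should not reshuffle the other parameters, which keeps diffs after a VE edit small.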

Infrastructure:

  • This release requires clients (VE, etc.) to return a 1.6.0 or greater HTML version string in the header. If not, Parsoid will return an HTTP 406. This can be fixed by updating VE (or relevant clients) to a more recent version.
  • T66003: Make strictSSL configurable per wiki as well
  • Use pure compute workers for the request processing
  • T123446: Bring back request timeouts
  • Lots of changes to wikitext linting code including new linter categories.
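The HTML version requirement in the first item above is the one most likely to bite after an upgrade, so here is a rough sketch of that kind of check. It is only an illustration, not Parsoid's implementation. The header shape (a `profile="…/HTML/x.y.z"` parameter), the names `parse_html_version` and `status_for`, and the choice to treat a missing version as unacceptable are all assumptions; only the 1.6.0 minimum and the 406 come from the release note.

```python
import re

# Minimum HTML version, taken from the release note above.
MIN_VERSION = (1, 6, 0)

# Assumed header shape: ...; profile=".../HTML/1.6.0"
_PROFILE_RE = re.compile(r'profile="[^"]*/HTML/(\d+)\.(\d+)\.(\d+)"')


def parse_html_version(header_value):
    """Return the (major, minor, patch) tuple found in the header, or None."""
    match = _PROFILE_RE.search(header_value)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def status_for(header_value):
    """Return 200 if the client's version is new enough, else 406.

    A missing version is treated as too old (a simplification in this sketch).
    """
    version = parse_html_version(header_value)
    if version is None or version < MIN_VERSION:
        return 406
    return 200


old_client = 'text/html; profile="https://www.mediawiki.org/wiki/Specs/HTML/1.5.0"'
new_client = 'text/html; profile="https://www.mediawiki.org/wiki/Specs/HTML/1.6.1"'
print(status_for(old_client), status_for(new_client))
```

Comparing the version as a tuple of integers rather than as a string matters here: as strings, "1.10.0" would sort before "1.6.0".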

Extensions

  • Match core's parsing of gallery dimensions
  • Added and extension handling.

Performance fixes:

  • Don't process token attributes unnecessarily
  • T176728: Use replaceChild instead of insertBefore
  • Performance fixes to domino, the html + dom library used in Parsoid

Dependencies:

  • Upgrade eslint, domino, service-runner, request and many other dev and non-dev dependencies

Cleanup:

  • Get rid of the handleUnbalancedTables DOM pass
  • The normalize post processor isn't needed any more
  • More use of arrow functions, promises, async/yield, ES6 classes in the codebase
  • Switch from jsduck to jsdoc3 for documentation and use new jsdoc-wmf-theme for documentation

0.8.0 / 2017-10-24

Notable wt -> html changes:

  • T43716: Parse and serialize language converter markup
  • T64270: Support video and audio content
  • T39902, T149794: Markup red links, disambiguation links in Parsoid HTML
  • T122965: Support HTML5 elements in older browsers
  • T173384: Improve handling of tokens in parser function targets
  • T153885: Handle templated template names
  • T151277: Handle [[Media:Foo.jpg]] syntax correctly
  • Generalize removal of useless p-wrappers
  • More permissive attribute name parsing
    • match PHP parser's attribute sanitizer
  • Remove dependence on native parser functions
  • Stop using usePHPPreProcessor as a proxy for an existing mw api to parse extensions
  • Several bug fixes

Notable html -> wt changes:

  • T135667, T138492: Use improved format specifier for TemplateData enabling templates to control formatting of transclusions after VE edits
  • T153107: Fix unhandled detection of modified link content
  • T136653: Handle interwiki shortcuts
  • T177784: Update reverse interwiki map to prefer language prefixes over others
  • Cleanup in separator handling in the wikitext serializer
  • Several bug fixes

API:

  • Remove support for pb2html in the http api

Extensions:

  • Cite:
    • T159894: Add support for Cite's responsive parameter
  • Gallery:
    • Remove inline styling for vertical alignment in traditional galleries
    • All media should scale in gallery

Dependencies:

  • Upgrade service-runner, mediawiki-title
  • Use uuid instead of node-uuid
  • Upgrade several dependencies to deal with security advisories
  • Limit core-js shimming to what we need

Infrastructure:

  • Migrate from jshint to eslint

Notable wikitext linting changes:

  • Move linter config properties to the linter config object
  • Only lint pages that have wikitext contentmodel
  • Lint multiple colon escaped links (incorrect usage)
  • Add an API endpoint to get lint errors for wikitext
  • Turn off ignored-table-attr output
  • Add detection for several wikitext patterns that render differently in Tidy compared to an HTML5-based parser (Parsoid, RemexHTML). This is only relevant if you want to fix pages before replacing Tidy or if you want to use Parsoid HTML for non-edit purposes.

Other:

  • Add code of conduct file to the repo

0.7.1 / 2017-04-05

No changes. New release to update nodejs dependency in the deb package.

0.7.0 / 2017-04-04

wt -> html changes:

  • T102209: Assign ids to H[1-6] tags that match PHP parser's assignment
  • T150112: Munge link fragments and element ids as in the php parser
  • T59603, T133267: Escape extlink content when containing ] anywhere
  • T156296: Update cached wiki configs for several wikimedia wikis
  • T50900: Improved error output for extensions, missing images
  • T109897: Remove implicit_table_data_tag rule
  • T98960: Accept entities in extlink href and url links
  • T113044: Complete templatearg representation in spec
  • T104523: Prevent infinite recursion in template expansion
  • T104662: Allow nested ref tags only in templates
  • Support extension tags which shadow "block level" HTML elements
  • A bunch of cleanup and edge case fixes in the PEG tokenizer
  • Don't accept pipe unconditionally in extlink
  • Percent-encode modules link in the HEAD section
  • Update CSS modules in HEAD section
  • Remove special-case non-void semantics for SOURCE
  • Fixup redirect-detecting regular expressions in multiple places
  • Edge case bug fixes to title handling code
  • Edge case bug fixes in async token transformation pipeline
  • Several fixes to the linting code to support the PHP Linter extension
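For the heading id item (T102209) in the list above, the idea can be sketched roughly as follows. This is a heavily simplified illustration and not the PHP parser's or Parsoid's real algorithm: it only turns whitespace into underscores and numbers repeated headings with `_2`, `_3`, and so on, ignoring all character escaping. The function name `assign_heading_ids` is made up.

```python
def assign_heading_ids(headings):
    """Assign an id to each heading text, numbering duplicates.

    Simplified sketch: whitespace runs become underscores, and the second
    and later occurrences of the same id get a _2, _3, ... suffix.
    """
    seen = {}
    ids = []
    for text in headings:
        base = "_".join(text.split())
        count = seen.get(base, 0) + 1
        seen[base] = count
        ids.append(base if count == 1 else base + "_" + str(count))
    return ids


print(assign_heading_ids(["See also", "History", "See also"]))
```

Matching ids between the two parsers matters because links of the form [[Page#Section]] only work if both produce the same anchor for the same heading.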

html -> wt changes:

  • T149209: Handle newlines in TD and TH cells
  • T160207: Fix serializing multi-line indent-pre w/ sol wt syntax
  • T133267: Escape extlink content when containing ] anywhere
  • T152633: Fix crasher from ConstrainedText
  • T112043: Handle anchors without hrefs
  • Fix and cleanup domdiff annotations which fixes some edge case bugs

Extensions:

  • T110910: Implement gallery extension natively inside Parsoid
  • T58381, T108216: Treat NOWIKI and html PRE as extension tags
  • Cite: T102134: Fix hrefs to render properly
  • Cite: Escape cite ids with Sanitizer.escapeId
  • Move section handling to the LST extension
  • Extension API improvements for the ProofreadPage extension
  • Normalize all extension options

Infrastructure changes:

  • Update parser tests syncing scripts to let us sync PHP extension tests to/from Parsoid.
  • Several fixes to parserTests scripts to improve output and processing of test options, among other things.
  • Bump domino, service-runner, minor versions of some deps, and some dev deps.
  • Switch to npm@3

API changes:

  • In dev-api mode, add ?follow_redirects=true support to wt2html API end points to get Parsoid to return an HTTP 302 response for redirect pages. This lets 302-following clients render the target page.
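As a rough idea of how a client would use the `?follow_redirects=true` option, the sketch below only builds the request URL. The base address `http://localhost:8000`, the `/{domain}/v3/page/html/{title}` path shape, and the helper name `wt2html_url` are assumptions for illustration; only the query parameter itself comes from the release note.

```python
from urllib.parse import quote, urlencode


def wt2html_url(base, domain, title, follow_redirects=False):
    """Build a wt2html request URL (path layout is assumed, see above)."""
    path = "/" + quote(domain, safe="") + "/v3/page/html/" + quote(title, safe="")
    query = urlencode({"follow_redirects": "true"}) if follow_redirects else ""
    return base.rstrip("/") + path + ("?" + query if query else "")


print(wt2html_url("http://localhost:8000", "femiwiki.com", "Main Page", True))
```

A client that follows redirects, as `urllib.request.urlopen` does by default, would then land on the HTML of the redirect target rather than of the redirect page itself.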

Other fixes:

  • T153797: ApiRequest: Clone the request options before modifying them
  • T150213: Suppress logs for known unknown contentmodels
  • Code cleanup and refactoring for upcoming audio/video support.
  • Code cleanup and refactoring in template handling for upcoming support for templated template names. This also fixes some edge case bugs.

0.6.1 / 2016-11-14

  • Fix broken 0.6.0 debian package

0.6.0 / 2016-11-07

wt -> html changes:

  • T147742: Trim template target after stripping comments
  • T142617: Handle invalid titles in transclusions
  • Handle caption-like text outside tables
  • migrateTrailingNLs DOM pass: Code simplifications and some subtle edge case bug fixes
  • Handle HTML tags in attribute text properly
  • A bunch of cleanup and fixes in the PEG tokenizer

html -> wt changes:

  • T134389: Serialize content in HTML tables using HTML tags
  • T125419: Fix selser issues serializing first table row
  • T137406: Emit |- between thead/tbody/tfoot
  • T139388: Ensure that edits to content nested in elements with templated attributes are not lost by the selective serializer.
  • T142998: Fix crasher in DOM normalization code
  • Normalize all lists to not mix wikitext and HTML list syntax
  • Always emit canonical wikitext for url links
  • Emit url-links where appropriate no matter what rel attribute says

Infrastructure changes:

  • T96195: Remove node 0.8 support
  • T113322: Use the mediawiki-title library instead of Parsoid-homegrown title normalization code.
  • Remove html5 treebuilder in favour of domino's
  • service-runner:
    • T90668: Replace custom server.js with service-runner
    • T141370: Use service-runner's logger as a backend to Parsoid's logger
    • Use service-runner's metrics reporter in the http api
  • Extensions:
    • T48580, T133320: Allow extensions to handle specific contentmodels
    • Let native extensions add stylesheets
  • Lots of wikitext linter fixes / features.

API changes:

  • T130638: Add data-mw as a separate JSON blob in the pagebundle
  • T135596: Return client error for missing data attributes
  • T114413: Provide HTML2HTML endpoint in Parsoid
  • T100681: Remove deprecated v1/v2 HTTP APIs
  • T143356: Separate data-mw API semantics
  • Add a page/wikitext/:title route to GET wikitext for a page
  • Updates in preparation for supporting version 2.x content in the future -- should be no-op for version 1.x content
  • Don't expose dev routes in production
  • Cleanup http redirects
  • Send error responses in the requested format

Performance fixes:

  • Template wrapping: Eliminate pathological tpl-range nesting scenario
  • computeDSR: Fix source of pathological O(n^2) behavior

Other fixes:

  • Make the http connect timeout configurable
  • Prevent JSON.stringify circular refs in template wrapping trace/error logs
  • Fix processing listeners in node v7.x

Thank you.
