package hwp5¶
module hwp5.filestructure
¶
-
class
hwp5.filestructure.
CompressedStorage
(wrapped)¶ decompress streams in the underlying storage
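The idea of a storage wrapper that decompresses streams on access can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name `DecompressingStorage`, the dict-like wrapped storage, and the use of raw DEFLATE (`wbits=-15`) are all assumptions made for the example.

```python
import zlib


class DecompressingStorage:
    """Hypothetical stand-in for a wrapper such as CompressedStorage."""

    def __init__(self, wrapped):
        # Assumption: `wrapped` maps stream names to compressed bytes.
        self.wrapped = wrapped

    def __iter__(self):
        return iter(self.wrapped)

    def __getitem__(self, name):
        # Assumption: each stream is raw DEFLATE data, hence wbits=-15.
        return zlib.decompress(self.wrapped[name], -15)


def raw_deflate(data):
    """Compress with raw DEFLATE (no zlib header) to build sample data."""
    compressor = zlib.compressobj(wbits=-15)
    return compressor.compress(data) + compressor.flush()


stg = DecompressingStorage({'Section0': raw_deflate(b'hello')})
print(stg['Section0'])
```

Wrapping rather than subclassing keeps the decompression step independent of how the underlying storage is read.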
-
class
hwp5.filestructure.
Hwp5Compression
(wrapped)¶ handle compressed streams in HWPv5 files
-
resolve_conversion_for
(name)¶ return a conversion function for the specified storage item
-
-
class
hwp5.filestructure.
Hwp5File
(stg)¶ represents HWPv5 File
Hwp5File(stg)
stg: an instance of Storage
-
resolve_conversion_for
(name)¶ return a conversion function for the specified storage item
-
-
class
hwp5.filestructure.
Hwp5FileBase
(stg)¶ Base of an Hwp5File.
Hwp5FileBase checks the basic validity of an HWP format v5 document and provides the fileheader property.
- Parameters
stg (an instance of storage, OleFileIO or filename) – an OLE2 structured storage.
- Raises
InvalidHwp5FileError – stg is not a valid HWP format v5 document.
-
resolve_conversion_for
(name)¶ return a conversion function for the specified storage item
-
hwp5.filestructure.
is_hwp5file
(filename)¶ Test whether it is an HWP format v5 file.
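Since an HWP format v5 document is an OLE2 structured storage, a first-stage check can look at the container signature. The sketch below only does that; it is a necessary condition, not the full test, and `looks_like_ole2` and `write_temp` are hypothetical helpers, not part of hwp5.

```python
import os
import tempfile

# Signature bytes that open every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex('d0cf11e0a1b11ae1')


def looks_like_ole2(filename):
    """Return True if the file starts with the OLE2 signature."""
    with open(filename, 'rb') as f:
        return f.read(len(OLE2_MAGIC)) == OLE2_MAGIC


def write_temp(data):
    """Write data to a temporary file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, 'wb') as f:
        f.write(data)
    return path
```

A real validity test would also have to inspect the contents of the storage, which this sketch does not attempt.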
module hwp5.recordstream
¶
-
class
hwp5.recordstream.
Hwp5File
(stg)¶ Hwp5File for ‘rec’ layer
-
hwp5.recordstream.
group_records_by_toplevel
(records, group_as_list=True)¶ group records by top-level trees and return iterable of the groups
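Grouping by top-level trees can be illustrated with a small generator. This sketch assumes each record is a dict with a `level` key and that a level of 0 starts a new tree; the function name `group_by_toplevel` is hypothetical and the real record layout may differ.

```python
def group_by_toplevel(records, group_as_list=True):
    """Yield groups of records, each group starting at a level-0 record."""
    group = []
    for record in records:
        # A level-0 record closes the previous group, if any.
        if record['level'] == 0 and group:
            yield group if group_as_list else iter(group)
            group = []
        group.append(record)
    if group:
        yield group if group_as_list else iter(group)


records = [
    {'level': 0, 'tag': 'a'},
    {'level': 1, 'tag': 'b'},
    {'level': 0, 'tag': 'c'},
]
for group in group_by_toplevel(records):
    print([r['tag'] for r in group])
```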
-
hwp5.recordstream.
record_to_json
(record, *args, **kwargs)¶ convert a record to json
module hwp5.binmodel
¶
-
class
hwp5.binmodel.
Hwp5File
(stg)¶
-
class
hwp5.binmodel.
ModelJsonEncoder
(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)¶ -
default
(obj)¶ Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
-
-
hwp5.binmodel.
init_record_parsing_context
(base, record)¶ Initialize a context to parse the given record
The initialization includes the following:
- context = dict(base)
- context['record'] = record
- context['stream'] = record payload stream
- Parameters
base – the base context to be shallow-copied into the new one
record – to be parsed
- Returns
new context
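The three initialization steps can be sketched as below. This is an illustration under assumptions: the record is taken to be a dict that carries its payload as bytes under a `payload` key, and `init_context` is a hypothetical name.

```python
import io


def init_context(base, record):
    """Build a new parsing context without mutating `base`."""
    context = dict(base)                 # shallow copy of the base context
    context['record'] = record
    # Assumption: the payload is available as bytes under 'payload'.
    context['stream'] = io.BytesIO(record['payload'])
    return context


base = {'version': (5, 0, 0, 0)}
record = {'payload': b'\x01\x02'}
context = init_context(base, record)
print(sorted(context))
```

Because the copy is shallow, values in `base` are shared with the new context, while keys added to the new context do not leak back into `base`.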
-
hwp5.binmodel.
model_to_json
(model, *args, **kwargs)¶ convert a model to json
-
hwp5.binmodel.
parse_model
(context, model)¶ Determine the model from the HWPTAG, then perform basic parsing
module hwp5.xmlmodel
¶
-
class
hwp5.xmlmodel.
Hwp5File
(stg)¶
-
hwp5.xmlmodel.
make_extended_controls_inline
(event_prefixed_mac, stack=None)¶ inline extended-controls into paragraph texts
-
hwp5.xmlmodel.
make_paragraphs_children_of_listheader
(event_prefixed_mac, parentmodel=<class 'hwp5.binmodel.tagid56_list_header.ListHeader'>, childmodel=<class 'hwp5.binmodel.tagid50_para_header.Paragraph'>)¶ make paragraphs children of the listheader
-
hwp5.xmlmodel.
make_texts_linesegmented_and_charshaped
(event_prefixed_mac)¶ lineseg/charshaped text chunks
-
hwp5.xmlmodel.
restructure_tablebody
(event_prefixed_mac)¶ Group the table columns in each row and wrap them with TableRow.
-
hwp5.xmlmodel.
tokenize_text_by_lang
(event_prefixed_mac)¶ Group the table columns in each row and wrap them with TableRow.
-
hwp5.xmlmodel.
wrap_section
(event_prefixed_mac, sect_id=None)¶ wrap a section with SectionDef
module hwp5.xmlformat
¶
module hwp5.storage
¶
-
hwp5.storage.
iter_storage_leafs
(stg, basepath='')¶ iterate over every leaf node in the storage
stg: an instance of Storage
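Walking a storage down to its leaves is a plain recursive traversal. The sketch below models a storage as nested dicts whose non-dict values are leaves; that model and the name `iter_leaf_paths` are assumptions for illustration, and the real Storage interface may differ.

```python
def iter_leaf_paths(stg, basepath=''):
    """Yield the path of every leaf in a nested-dict storage."""
    for name in stg:
        path = basepath + name
        item = stg[name]
        if isinstance(item, dict):
            # A nested dict plays the role of a sub-storage.
            yield from iter_leaf_paths(item, path + '/')
        else:
            yield path


stg = {'FileHeader': b'', 'BodyText': {'Section0': b'', 'Section1': b''}}
for path in iter_leaf_paths(stg):
    print(path)
```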
-
hwp5.storage.
unpack
(stg, outbase)¶ unpack a storage into outbase directory
stg: an instance of Storage
outbase: path to a directory in filesystem (should not end with '/')
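Unpacking mirrors the storage hierarchy onto the filesystem: sub-storages become directories and streams become files. A minimal sketch, again assuming a nested-dict storage with bytes leaves and a hypothetical function name:

```python
import os
import tempfile


def unpack_to_directory(stg, outbase):
    """Write a nested-dict storage under the directory `outbase`."""
    os.makedirs(outbase, exist_ok=True)
    for name, item in stg.items():
        path = os.path.join(outbase, name)
        if isinstance(item, dict):
            unpack_to_directory(item, path)   # sub-storage -> directory
        else:
            with open(path, 'wb') as f:       # stream -> file
                f.write(item)


with tempfile.TemporaryDirectory() as tmp:
    unpack_to_directory({'BodyText': {'Section0': b'xyz'}}, tmp)
    print(sorted(os.listdir(tmp)))
```

Stream names that are not valid filenames would need escaping first; the sketch does not handle that.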
module hwp5.dataio
¶
-
exception
hwp5.dataio.
Eof
(*args)¶
-
exception
hwp5.dataio.
OutOfData
¶
-
exception
hwp5.dataio.
ParseError
(*args, **kwargs)¶
-
hwp5.dataio.
decode_utf16le_with_hypua
(bytes)¶ decode utf-16le encoded bytes with Hanyang-PUA codes into a unicode string with Hangul Jamo codes
- Parameters
bytes – utf-16le encoded bytes with Hanyang-PUA codes
- Returns
a unicode string with Hangul Jamo codes
module hwp5.tagids
¶
module hwp5.importhelper
¶
-
hwp5.importhelper.
pkg_resources_filename
(pkg_name, path)¶ the equivalent of pkg_resources.resource_filename()
-
hwp5.importhelper.
pkg_resources_filename_fallback
(pkg_name, path)¶ a fallback implementation of pkg_resources_filename()
module hwp5.treeop
¶
-
hwp5.treeop.
build_subtree
(event_prefixed_items)¶ build a tree from (event, item) stream
Example Scenario:
    ...
    (STARTEVENT, rootitem)      # should be consumed by the caller
    --- call build_subtree() ---
    (STARTEVENT, child1)        # consumed by build_subtree()
    (STARTEVENT, grandchild)    # (same)
    (ENDEVENT, grandchild)      # (same)
    (ENDEVENT, child1)          # (same)
    (STARTEVENT, child2)        # (same)
    (ENDEVENT, child2)          # (same)
    (ENDEVENT, rootitem)        # (same); build_subtree() returns here
    --- build_subtree() returns ---
    (STARTEVENT, another_root)
    ...
The result will be (rootitem, [(child1, [(grandchild, [])]), (child2, [])])
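The scenario above can be reproduced with a short recursive sketch. The event markers and the function name `build_subtree_sketch` are hypothetical; the code only illustrates the consumption pattern and is not the library's implementation.

```python
STARTEVENT, ENDEVENT = 'start', 'end'   # hypothetical event markers


def build_subtree_sketch(events):
    """Consume events up to the matching ENDEVENT; return (item, children).

    `events` must be an iterator whose opening STARTEVENT has already
    been consumed by the caller.
    """
    children = []
    for event, item in events:
        if event == STARTEVENT:
            # The recursive call consumes the child's whole subtree.
            children.append(build_subtree_sketch(events))
        elif event == ENDEVENT:
            return item, children


events = iter([
    (STARTEVENT, 'rootitem'),
    (STARTEVENT, 'child1'),
    (STARTEVENT, 'grandchild'),
    (ENDEVENT, 'grandchild'),
    (ENDEVENT, 'child1'),
    (STARTEVENT, 'child2'),
    (ENDEVENT, 'child2'),
    (ENDEVENT, 'rootitem'),
    (STARTEVENT, 'another_root'),
])
next(events)                      # the caller consumes the root's STARTEVENT
tree = build_subtree_sketch(events)
print(tree)
```

Sharing one iterator across the recursive calls is what leaves the stream positioned at the next root when the function returns.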
-
hwp5.treeop.
prefix_ancestors
(event_prefixed_items, root_item=None)¶ convert iterable of (event, item) into iterable of (ancestors, item)
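Converting (event, item) pairs into (ancestors, item) pairs amounts to keeping a stack of open items. In this sketch the event markers and the name `with_ancestors` are hypothetical, and including `root_item` as the first ancestor is an assumption about the intended behavior.

```python
STARTEVENT, ENDEVENT = 'start', 'end'   # hypothetical event markers


def with_ancestors(events, root_item=None):
    """Yield (ancestors, item) once per STARTEVENT."""
    stack = [root_item]
    for event, item in events:
        if event == STARTEVENT:
            yield list(stack), item   # copy, so later pushes do not leak
            stack.append(item)
        elif event == ENDEVENT:
            stack.pop()


events = [
    (STARTEVENT, 'a'), (STARTEVENT, 'b'), (ENDEVENT, 'b'), (ENDEVENT, 'a'),
    (STARTEVENT, 'c'), (ENDEVENT, 'c'),
]
for ancestors, item in with_ancestors(events):
    print(ancestors, item)
```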
-
hwp5.treeop.
prefix_ancestors_from_level
(level_prefixed_items, root_item=None)¶ convert iterable of (level, item) into iterable of (ancestors, item)
- Parameters
level_prefixed_items – iterable of tuple(level, item)
- Returns
iterable of tuple(ancestors, item)
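When items carry their depth instead of start/end events, the ancestor stack can be trimmed to the item's level before each yield. A minimal sketch, assuming levels start at 0 and that `root_item` counts as the outermost ancestor; `ancestors_from_level` is a hypothetical name.

```python
def ancestors_from_level(level_prefixed_items, root_item=None):
    """Yield (ancestors, item) from an iterable of (level, item)."""
    stack = [root_item]
    for level, item in level_prefixed_items:
        # Keep the root plus exactly `level` ancestors.
        del stack[level + 1:]
        yield list(stack), item
        stack.append(item)


items = [(0, 'a'), (1, 'b'), (1, 'c'), (0, 'd')]
for ancestors, item in ancestors_from_level(items):
    print(ancestors, item)
```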
-
hwp5.treeop.
prefix_event
(level_prefixed_items, root_item=None)¶ convert iterable of (level, item) into iterable of (event, item)
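The (level, item) to (event, item) conversion can be sketched by closing every open item that is at or below the incoming level before opening the new one. The event markers and the name `events_from_levels` are hypothetical, and the `root_item` parameter is left out of this sketch.

```python
STARTEVENT, ENDEVENT = 'start', 'end'   # hypothetical event markers


def events_from_levels(level_prefixed_items):
    """Yield (event, item) pairs from an iterable of (level, item)."""
    stack = []
    for level, item in level_prefixed_items:
        # Close items that are not ancestors of the incoming item.
        while len(stack) > level:
            yield ENDEVENT, stack.pop()
        yield STARTEVENT, item
        stack.append(item)
    # Close whatever is still open at the end of the input.
    while stack:
        yield ENDEVENT, stack.pop()


for event, item in events_from_levels([(0, 'a'), (1, 'b'), (0, 'c')]):
    print(event, item)
```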
-
hwp5.treeop.
tree_events
(rootitem, childs)¶ generate tuples of (event, item) from a tree
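Generating events from a tree is the inverse of building a subtree: emit a start event, recurse into the children, then emit an end event. The sketch assumes the `(item, children)` tuple shape shown for build_subtree; the markers and the name `events_from_tree` are hypothetical.

```python
STARTEVENT, ENDEVENT = 'start', 'end'   # hypothetical event markers


def events_from_tree(rootitem, childs):
    """Yield (event, item) pairs for a tree given as (item, children)."""
    yield STARTEVENT, rootitem
    for child, grandchilds in childs:
        yield from events_from_tree(child, grandchilds)
    yield ENDEVENT, rootitem


tree = ('rootitem', [('child1', [('grandchild', [])]), ('child2', [])])
for event, item in events_from_tree(*tree):
    print(event, item)
```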
-
hwp5.treeop.
tree_events_multi
(trees)¶ generate tuples of (event, item) from trees
module hwp5.utils
¶
-
class
hwp5.utils.
GeneratorReader
(gen)¶ convert a bytes generator into a file-like reader
    def gen():
        yield b'hello'
        yield b'world'

    f = GeneratorReader(gen())
    assert b'hell' == f.read(4)
    assert b'oworld' == f.read()
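How a generator of chunks can back a `read()` method is sketched below: chunks are pulled only until the buffer can satisfy the requested size. `ChunkReader` is a hypothetical stand-in written for illustration, not the library's class.

```python
class ChunkReader:
    """Minimal file-like reader over an iterable of bytes chunks."""

    def __init__(self, gen):
        self.gen = iter(gen)
        self.buffer = b''

    def read(self, size=-1):
        if size is None or size < 0:
            # Drain everything that is left.
            data = self.buffer + b''.join(self.gen)
            self.buffer = b''
            return data
        # Pull chunks lazily until enough bytes are buffered.
        while len(self.buffer) < size:
            chunk = next(self.gen, None)
            if chunk is None:
                break
            self.buffer += chunk
        data, self.buffer = self.buffer[:size], self.buffer[size:]
        return data


def gen():
    yield b'hello'
    yield b'world'


f = ChunkReader(gen())
print(f.read(4), f.read())
```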
-
class
hwp5.utils.
GeneratorTextReader
(gen)¶ convert a string generator into a file-like reader
    def gen():
        yield 'hello'
        yield 'world'

    f = GeneratorTextReader(gen())
    assert 'hell' == f.read(4)
    assert 'oworld' == f.read()
-
hwp5.utils.
generate_json_array
(tokens)¶ generate json array with given tokens
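Emitting a JSON array incrementally means yielding the brackets and separators around tokens that are already serialized. The sketch assumes each token is a JSON text; `json_array_chunks` is a hypothetical name and the exact whitespace of the real output is not specified here.

```python
import json


def json_array_chunks(tokens):
    """Yield string chunks that join into one JSON array."""
    yield '['
    for index, token in enumerate(tokens):
        if index:
            yield ','          # separator before every token but the first
        yield token
    yield ']'


tokens = (json.dumps(value) for value in [{'a': 1}, 2, 'x'])
text = ''.join(json_array_chunks(tokens))
print(json.loads(text))
```

Yielding chunks keeps memory use flat for long record streams, since no complete list is built before writing.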
-
hwp5.utils.
unicode_escape
(s)¶ Escape a string.
- Parameters
s (unicode) – a string to escape
- Returns
escaped string
- Return type
unicode
-
hwp5.utils.
unicode_unescape
(s)¶ Unescape a string.
- Parameters
s (unicode) – a string to unescape
- Returns
unescaped string
- Return type
unicode
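An escape/unescape pair of this kind can be sketched with Python's built-in `unicode_escape` codec. Whether hwp5.utils uses this codec is an assumption; the names `escape` and `unescape` are hypothetical.

```python
def escape(s):
    """Return an ASCII-only string with backslash escapes."""
    return s.encode('unicode_escape').decode('ascii')


def unescape(s):
    """Reverse escape(); expects ASCII input with backslash escapes."""
    return s.encode('ascii').decode('unicode_escape')


original = '한\n'
escaped = escape(original)
print(escaped)
print(unescape(escaped) == original)
```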