package hwp5

module hwp5.filestructure

class hwp5.filestructure.CompressedStorage(wrapped)

decompress streams in the underlying storage

class hwp5.filestructure.Hwp5Compression(wrapped)

handle compressed streams in HWPv5 files
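
As a rough illustration of what handling a compressed stream can involve, the sketch below inflates a headerless (raw) deflate stream with the standard zlib module. That HWPv5 streams use raw deflate is an assumption here, and the helper names are hypothetical, not part of hwp5.

```python
import zlib


def decompress_raw_deflate(data):
    """Inflate a raw deflate stream (no zlib header or checksum)."""
    # A negative wbits value tells zlib to expect a headerless stream.
    return zlib.decompress(data, -15)


def compress_raw_deflate(data):
    """Counterpart used here only to produce sample input."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


sample = b'hwp5 stream payload ' * 10
assert decompress_raw_deflate(compress_raw_deflate(sample)) == sample
```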

class hwp5.filestructure.Hwp5File(stg)

represents HWPv5 File

Hwp5File(stg)

stg: an instance of Storage

class hwp5.filestructure.Hwp5FileBase(stg)

Base of an Hwp5File.

Hwp5FileBase checks the basic validity of an HWP format v5 document and provides the fileheader property.

Parameters: stg (an instance of Storage, OleFileIO or filename) -- an OLE2 structured storage.
Raises: InvalidHwp5FileError -- stg is not a valid HWP format v5 document.
hwp5.filestructure.is_hwp5file(filename)

Test whether it is an HWP format v5 file.
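
HWP format v5 documents are OLE2 structured storages, so one cheap pre-check is to look at the OLE2 signature in the first 8 bytes of the file. The sketch below does only that. It is a necessary but not sufficient test, and looks_like_ole2 is a hypothetical helper, not what is_hwp5file itself does.

```python
# OLE2 / Compound File Binary signature (first 8 bytes of the file).
OLE2_MAGIC = b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1'


def looks_like_ole2(filename):
    """Hypothetical pre-check: True if the file starts with the OLE2 magic."""
    with open(filename, 'rb') as f:
        return f.read(len(OLE2_MAGIC)) == OLE2_MAGIC
```

A file that passes this check may still be some other OLE2 document, so a full validity check is still needed afterwards.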

module hwp5.recordstream

class hwp5.recordstream.Hwp5File(stg)

Hwp5File for 'rec' layer

hwp5.recordstream.group_records_by_toplevel(records, group_as_list=True)

group records by top-level trees and return iterable of the groups
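
The grouping idea can be sketched without the library. The version below assumes each record is a dict with a 'level' key and that a record at level 0 starts a new top-level tree. Both are assumptions made for illustration, and group_by_toplevel is a hypothetical stand-in.

```python
def group_by_toplevel(records, group_as_list=True):
    """Yield one group per top-level tree (assumes record['level'] exists)."""
    group = []
    for record in records:
        # A level-0 record opens a new tree, so flush the previous group.
        if record['level'] == 0 and group:
            yield group if group_as_list else iter(group)
            group = []
        group.append(record)
    if group:
        yield group if group_as_list else iter(group)


records = [{'level': 0, 'id': 'a'}, {'level': 1, 'id': 'b'},
           {'level': 0, 'id': 'c'}]
groups = list(group_by_toplevel(records))
assert [[r['id'] for r in g] for g in groups] == [['a', 'b'], ['c']]
```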

hwp5.recordstream.record_to_json(record, *args, **kwargs)

convert a record to json

module hwp5.binmodel

class hwp5.binmodel.ModelJsonEncoder(skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, encoding='utf-8', default=None)
default(obj)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
hwp5.binmodel.init_record_parsing_context(base, record)

Initialize a context to parse the given record

The initialization includes the following:

  • context = dict(base)
  • context['record'] = record
  • context['stream'] = record payload stream

Parameters:
  • base -- the base context to be shallow-copied into the new one
  • record -- the record to be parsed
Returns: new context
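
The three initialization steps can be sketched as follows. The record is assumed to be a dict whose 'payload' key holds bytes. That key name and the function name init_context are assumptions for illustration only.

```python
import io


def init_context(base, record):
    """Sketch of the described steps; not the library's implementation."""
    context = dict(base)                 # shallow copy of the base context
    context['record'] = record
    # Assumption: the payload is available as bytes under 'payload'.
    context['stream'] = io.BytesIO(record['payload'])
    return context


base = {'version': (5, 0, 0, 0)}
ctx = init_context(base, {'payload': b'\x01\x02'})
assert 'record' not in base              # the base context is left untouched
assert ctx['stream'].read() == b'\x01\x02'
```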

hwp5.binmodel.model_to_json(model, *args, **kwargs)

convert a model to json

hwp5.binmodel.parse_model(context, model)

Determine the model from the HWPTAG, then perform basic parsing.

module hwp5.xmlmodel

hwp5.xmlmodel.make_extended_controls_inline(event_prefixed_mac, stack=None)

inline extended-controls into paragraph texts

hwp5.xmlmodel.make_paragraphs_children_of_listheader(event_prefixed_mac, parentmodel=<class 'hwp5.binmodel.tagid56_list_header.ListHeader'>, childmodel=<class 'hwp5.binmodel.tagid50_para_header.Paragraph'>)

make paragraphs children of the listheader

hwp5.xmlmodel.make_texts_linesegmented_and_charshaped(event_prefixed_mac)

lineseg/charshaped text chunks

hwp5.xmlmodel.restructure_tablebody(event_prefixed_mac)

Group table columns in each row and wrap them with TableRow.

hwp5.xmlmodel.tokenize_text_by_lang(event_prefixed_mac)

tokenize text by language

hwp5.xmlmodel.wrap_section(event_prefixed_mac, sect_id=None)

wrap a section with SectionDef

module hwp5.storage

hwp5.storage.iter_storage_leafs(stg, basepath=u'')

iterate over every leaf node in the storage

stg: an instance of Storage
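
To illustrate the traversal, the sketch below walks a nested dict that stands in for a storage: dict values play the role of sub-storages and anything else is a leaf stream. The dict stand-in, the function name iter_leafs and the choice to yield slash-joined paths are all assumptions, not the Storage interface itself.

```python
def iter_leafs(tree, basepath=''):
    """Yield the path of every leaf in a nested-dict stand-in for a storage."""
    for name in sorted(tree):
        path = basepath + name
        node = tree[name]
        if isinstance(node, dict):
            # A dict stands in for a sub-storage: recurse into it.
            for leaf in iter_leafs(node, path + '/'):
                yield leaf
        else:
            yield path


storage = {'FileHeader': b'',
           'BodyText': {'Section0': b'', 'Section1': b''}}
assert list(iter_leafs(storage)) == [
    'BodyText/Section0', 'BodyText/Section1', 'FileHeader']
```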

hwp5.storage.unpack(stg, outbase)

unpack a storage into outbase directory

stg: an instance of Storage

outbase: path to a directory in filesystem (should not end with '/')

module hwp5.dataio

class hwp5.dataio.BSTR
basetype

alias of __builtin__.unicode

class hwp5.dataio.BYTE
basetype

alias of __builtin__.int

class hwp5.dataio.DOUBLE
basetype

alias of __builtin__.float

exception hwp5.dataio.Eof(*args)
class hwp5.dataio.HWPUNIT
basetype

alias of __builtin__.long

class hwp5.dataio.HWPUNIT16
basetype

alias of __builtin__.int

class hwp5.dataio.INT16
basetype

alias of __builtin__.int

class hwp5.dataio.INT32
basetype

alias of __builtin__.int

class hwp5.dataio.INT8
basetype

alias of __builtin__.int

exception hwp5.dataio.OutOfData
exception hwp5.dataio.ParseError(*args, **kwargs)
class hwp5.dataio.SHWPUNIT
basetype

alias of __builtin__.int

class hwp5.dataio.UINT16
basetype

alias of __builtin__.int

class hwp5.dataio.UINT32
basetype

alias of __builtin__.long

class hwp5.dataio.UINT8
basetype

alias of __builtin__.int

class hwp5.dataio.WCHAR
basetype

alias of __builtin__.int

class hwp5.dataio.WORD
basetype

alias of __builtin__.int

hwp5.dataio.decode_utf16le_with_hypua(bytes)

decode utf-16le encoded bytes with Hanyang-PUA codes into a unicode string with Hangul Jamo codes

Parameters: bytes -- utf-16le encoded bytes with Hanyang-PUA codes
Returns: a unicode string with Hangul Jamo codes

module hwp5.tagids

module hwp5.plat

module hwp5.importhelper

hwp5.importhelper.pkg_resources_filename(pkg_name, path)

the equivalent of pkg_resources.resource_filename()

hwp5.importhelper.pkg_resources_filename_fallback(pkg_name, path)

a fallback implementation of pkg_resources_filename()

module hwp5.treeop

hwp5.treeop.build_subtree(event_prefixed_items)

build a tree from (event, item) stream

Example Scenario:

...
(STARTEVENT, rootitem)          # should be consumed by the caller
--- call build_subtree() ---
(STARTEVENT, child1)            # consumed by build_subtree()
(STARTEVENT, grandchild)        # (same)
(ENDEVENT, grandchild)          # (same)
(ENDEVENT, child1)              # (same)
(STARTEVENT, child2)            # (same)
(ENDEVENT, child2)              # (same)
(ENDEVENT, rootitem)            # (same), build_subtree() returns
--- build_subtree() returns ---
(STARTEVENT, another_root)
...

result will be (rootitem, [(child1, [(grandchild, [])]),
                           (child2, [])])
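
The scenario above can be reproduced with a small recursive sketch. The event markers are placeholder strings and build_subtree_sketch is a hypothetical reimplementation written for illustration. It expects an iterator, since the recursion relies on shared consumption of the stream.

```python
STARTEVENT, ENDEVENT = 'start', 'end'    # placeholder event markers


def build_subtree_sketch(events):
    """Consume events up to the ENDEVENT that closes the current item."""
    children = []
    for event, item in events:
        if event == STARTEVENT:
            # The recursive call consumes this child's whole subtree.
            children.append(build_subtree_sketch(events))
        elif event == ENDEVENT:
            return item, children


events = iter([
    (STARTEVENT, 'rootitem'),
    (STARTEVENT, 'child1'), (STARTEVENT, 'grandchild'),
    (ENDEVENT, 'grandchild'), (ENDEVENT, 'child1'),
    (STARTEVENT, 'child2'), (ENDEVENT, 'child2'),
    (ENDEVENT, 'rootitem'),
    (STARTEVENT, 'another_root'),
])
next(events)                             # the caller consumes the root start
tree = build_subtree_sketch(events)
assert tree == ('rootitem', [('child1', [('grandchild', [])]),
                             ('child2', [])])
assert next(events) == (STARTEVENT, 'another_root')
```
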
hwp5.treeop.prefix_ancestors(event_prefixed_items, root_item=None)

convert iterable of (event, item) into iterable of (ancestors, item)

hwp5.treeop.prefix_ancestors_from_level(level_prefixed_items, root_item=None)

convert iterable of (level, item) into iterable of (ancestors, item)

Parameters: level_prefixed_items -- iterable of tuple(level, item)
Returns: iterable of tuple(ancestors, item)
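
A minimal sketch of the conversion keeps a stack of the items currently open. It assumes levels start at 0 and never jump by more than one, and it leaves out the root_item parameter. The function name and the choice to represent ancestors as a list are assumptions.

```python
def ancestors_from_level(level_prefixed_items):
    """Turn (level, item) pairs into (ancestors, item) pairs."""
    stack = []
    for level, item in level_prefixed_items:
        del stack[level:]                # drop items at this level or deeper
        yield list(stack), item
        stack.append(item)


items = [(0, 'a'), (1, 'b'), (2, 'c'), (1, 'd')]
assert list(ancestors_from_level(items)) == [
    ([], 'a'), (['a'], 'b'), (['a', 'b'], 'c'), (['a'], 'd')]
```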

hwp5.treeop.prefix_event(level_prefixed_items, root_item=None)

convert iterable of (level, item) into iterable of (event, item)
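
One way to picture this conversion: an item is closed as soon as a later item arrives at the same or a shallower level. The sketch below follows that rule with placeholder event markers. It ignores root_item and assumes levels start at 0, so treat it as an illustration only.

```python
STARTEVENT, ENDEVENT = 'start', 'end'    # placeholder event markers


def events_from_level(level_prefixed_items):
    """Turn (level, item) pairs into (event, item) pairs."""
    stack = []                           # items still waiting for ENDEVENT
    for level, item in level_prefixed_items:
        while len(stack) > level:
            yield ENDEVENT, stack.pop()
        yield STARTEVENT, item
        stack.append(item)
    while stack:                         # close whatever is still open
        yield ENDEVENT, stack.pop()


items = [(0, 'a'), (1, 'b'), (1, 'c'), (0, 'd')]
assert list(events_from_level(items)) == [
    (STARTEVENT, 'a'), (STARTEVENT, 'b'), (ENDEVENT, 'b'),
    (STARTEVENT, 'c'), (ENDEVENT, 'c'), (ENDEVENT, 'a'),
    (STARTEVENT, 'd'), (ENDEVENT, 'd')]
```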

hwp5.treeop.tree_events(rootitem, childs)

generate tuples of (event, item) from a tree
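
Assuming the tree has the (item, children) shape that build_subtree() returns, the event stream can be generated by a depth-first walk. The sketch below uses placeholder event markers and a hypothetical function name.

```python
STARTEVENT, ENDEVENT = 'start', 'end'    # placeholder event markers


def tree_events_sketch(rootitem, childs):
    """Yield (event, item) pairs for a tree given as an item plus children."""
    yield STARTEVENT, rootitem
    for child, grandchilds in childs:
        for pair in tree_events_sketch(child, grandchilds):
            yield pair
    yield ENDEVENT, rootitem


childs = [('child1', [('grandchild', [])]), ('child2', [])]
assert list(tree_events_sketch('rootitem', childs)) == [
    (STARTEVENT, 'rootitem'),
    (STARTEVENT, 'child1'), (STARTEVENT, 'grandchild'),
    (ENDEVENT, 'grandchild'), (ENDEVENT, 'child1'),
    (STARTEVENT, 'child2'), (ENDEVENT, 'child2'),
    (ENDEVENT, 'rootitem')]
```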

hwp5.treeop.tree_events_multi(trees)

generate tuples of (event, item) from trees

module hwp5.utils

class hwp5.utils.GeneratorReader(gen)

convert a string generator into file-like reader

def gen():
    yield b'hello'
    yield b'world'

f = GeneratorReader(gen())
assert b'hell' == f.read(4)
assert b'oworld' == f.read()
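
For readers curious how such a reader can work, here is a minimal buffering sketch. ChunkReader is a hypothetical stand-in, not the library class, and it supports only read().

```python
class ChunkReader(object):
    """File-like read() over an iterator of byte chunks (illustrative only)."""

    def __init__(self, gen):
        self._gen = gen
        self._buf = b''

    def read(self, size=-1):
        if size is None or size < 0:
            # Drain everything that is left.
            data = self._buf + b''.join(self._gen)
            self._buf = b''
            return data
        # Pull chunks until the buffer can satisfy the request.
        while len(self._buf) < size:
            try:
                self._buf += next(self._gen)
            except StopIteration:
                break
        data, self._buf = self._buf[:size], self._buf[size:]
        return data


f = ChunkReader(iter([b'hello', b'world']))
assert f.read(4) == b'hell'
assert f.read() == b'oworld'
```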

class hwp5.utils.GeneratorTextReader(gen)

convert a string generator into file-like reader

def gen():
    yield 'hello'
    yield 'world'

f = GeneratorTextReader(gen())
assert 'hell' == f.read(4)
assert 'oworld' == f.read()

hwp5.utils.generate_json_array(tokens)

generate json array with given tokens

hwp5.utils.unicode_escape(s)

Escape a string.

Parameters: s (unicode) -- a string to escape
Returns: escaped string
Return type: unicode
hwp5.utils.unicode_unescape(s)

Unescape a string.

Parameters: s (unicode) -- a string to unescape
Returns: unescaped string
Return type: unicode
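
The exact escaping rules are not spelled out here. As an illustration of an escape/unescape round trip, the sketch below uses Python's built-in unicode_escape codec. Whether hwp5.utils uses this codec is an assumption, and the function names are hypothetical.

```python
def escape(s):
    """Escape non-ASCII and control characters, keeping a text string."""
    return s.encode('unicode_escape').decode('ascii')


def unescape(s):
    """Reverse escape(); expects an ASCII-only escaped string."""
    return s.encode('ascii').decode('unicode_escape')


original = u'\ud55c\uae00\n'             # two Hangul syllables and a newline
escaped = escape(original)
assert escaped == u'\\ud55c\\uae00\\n'
assert unescape(escaped) == original
```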