Commit
0 parents
commit 9c091e6
Showing 50 changed files with 2,397 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
# Created by .ignore support plugin (hsz.mobi) | ||
### Python template | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
|
||
# C extensions | ||
*.so | ||
|
||
# Distribution / packaging | ||
.Python | ||
env/ | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.coverage | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
target/ | ||
|
||
HtmlToWord.old |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# file GENERATED by distutils, do NOT edit | ||
setup.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
WordInserter | ||
=== | ||
This module allows you to insert HTML or Markdown into a Word document, and to build Word documents programmatically | ||
in pure Python. The API is simple to use: | ||
|
||
``` python | ||
from wordinserter import parse, render | ||
|
||
operations = parse(html, parser="html") # or parser="markdown" | ||
render(operations, document=document, constants=constants) | ||
``` | ||
|
||
Inserting HTML or Markdown into a Word document is a two-step process: first the input is parsed into a sequence | ||
of operations, which is then *rendered* into a Word document. This library currently supports inserting only through the | ||
Word COM interface, which means it is Windows-specific at the moment. | ||
|
||
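Reviewer note: the two-step split described above (parse into operations, then render them) can be illustrated without Word at all. The sketch below is a toy built on the standard library only. The names `Op`, `toy_parse` and `toy_render` are hypothetical and are not part of wordinserter, whose real operation types and renderer are not shown in this diff.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Op:
    """One hypothetical operation: a tag opening, a tag closing, or text."""
    kind: str   # "start", "end" or "text"
    value: str  # tag name for start/end, text content for text


class _OpCollector(HTMLParser):
    """Collects a flat list of operations while the HTML is parsed."""

    def __init__(self):
        super().__init__()
        self.ops = []

    def handle_starttag(self, tag, attrs):
        self.ops.append(Op("start", tag))

    def handle_endtag(self, tag):
        self.ops.append(Op("end", tag))

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text between tags
            self.ops.append(Op("text", data))


def toy_parse(html):
    """Step 1: turn markup into a sequence of operations."""
    collector = _OpCollector()
    collector.feed(html)
    collector.close()
    return collector.ops


def toy_render(ops):
    """Step 2: walk the operations; here the target is plain text with *bold* markers."""
    out = []
    for op in ops:
        if op.kind == "text":
            out.append(op.value)
        elif op.value in ("b", "strong"):
            out.append("*")  # emitted for both the opening and the closing tag
    return "".join(out)


print(toy_render(toy_parse("<p>Hello <b>world</b></p>")))  # Hello *world*
```

Because the renderer only sees operations, a second input format needs only a second parser, which is presumably why the README offers `parser="html"` and `parser="markdown"` behind one `render` call.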
Below is a more complete example that also starts Word. It inserts a representation of the HTML | ||
into a new Word document, including the image, caption and list. | ||
|
||
``` python | ||
from wordinserter import render, parse | ||
from comtypes.client import CreateObject | ||
|
||
# This opens Microsoft Word and creates a new document. | ||
word = CreateObject("Word.Application") | ||
word.Visible = True # Don't set this to True in production! | ||
document = word.Documents.Add() | ||
# comtypes generates this module once Word has been created, so import it afterwards. | ||
from comtypes.gen import Word as constants | ||
|
||
html = """ | ||
<h3>This is a title</h3> | ||
<p><img src="http://placehold.it/150x150" alt="I go below the image as a caption"></p> | ||
<p><i>This is <b>some</b> text</i> in a <a href="http://google.com">paragraph</a></p> | ||
<ul> | ||
<li>Boo! I am a <b>list</b></li> | ||
</ul> | ||
""" | ||
|
||
# Parse the HTML into a list of operations then feed them into render. | ||
operations = parse(html, parser="html") | ||
render(operations, document=document, constants=constants) | ||
``` | ||
|
||
What's with the constants part? WordInserter is agnostic to the COM library you use. Each library exposes the constant | ||
values that WordInserter needs in a different way: the pywin32 library exposes them as win32com.client.constants, | ||
whereas the comtypes library exposes them as a module that resides in comtypes.gen. Rather than guess which one you | ||
are using, WordInserter requires you to pass the right one in explicitly. | ||
|
||
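Reviewer note: the README only shows the comtypes variant, so here is a minimal sketch of how the two COM libraries might each supply the `constants` object. The helper name `open_word` is hypothetical and not part of wordinserter. The pywin32 branch assumes `gencache.EnsureDispatch` is used so that `win32com.client.constants` is populated. Both branches need Windows with Word installed, which is why the imports are deferred until a backend is chosen.

```python
def open_word(backend="comtypes"):
    """Start Word and return (application, constants) for the chosen COM library.

    Hypothetical helper for illustration; requires Windows and Microsoft Word.
    """
    if backend == "comtypes":
        from comtypes.client import CreateObject

        word = CreateObject("Word.Application")
        # Imported after CreateObject, as in the README example above.
        from comtypes.gen import Word as constants
    elif backend == "pywin32":
        from win32com.client import constants, gencache

        # EnsureDispatch builds the generated wrappers that fill in `constants`.
        word = gencache.EnsureDispatch("Word.Application")
    else:
        raise ValueError(f"unknown COM backend: {backend!r}")
    return word, constants
```

With this in place the rendering call stays the same for both libraries: `word, constants = open_word("pywin32")`, then `render(operations, document=word.Documents.Add(), constants=constants)`.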
|
||
### Install | ||
Get it [from PyPI here](https://pypi.python.org/pypi/wordinserter). This has been built with Word 2010 and 2013; older | ||
versions may produce different results. | ||
|
||
|
||
## Supported Operations | ||
WordInserter currently supports a range of different operations, including code blocks, font size/colors, images, | ||
hyperlinks, numbered and bullet lists ( |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<p>(<a href="http://google.com"><strong>bold</strong></a>) This should not be bold <strong>But this should be</strong></p> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
<p>Some Text</p> | ||
|
||
<p> | ||
TEXT YO<br/> | ||
More break<br/> | ||
This should not have a break in it<br/> | ||
:) | ||
</p> | ||
|
||
<ul> | ||
<li>83.3.136.74<br> | ||
</li> | ||
<li>1.100.136.75<br> | ||
</li> | ||
<li>83.2.1.76</li> | ||
</ul> | ||
<ul> | ||
<li>www.paystobeprescot.co.uk</li> | ||
</ul> | ||
|
||
<p>sup doods</p> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
<h1>Heading 1</h1> | ||
<h2>Heading 2</h2> | ||
<h3>Heading 3</h3> | ||
<h4>Heading 4</h4> | ||
|
||
<p>This is a test document showing what HtmlToWord can do. I hope this doesn't break.</p> | ||
|
||
<p><b>Bold Text.</b> <i>Italic Text.</i> <b><i>Mix</i>ed</b><i> <b>St</b>yles</i><b><i>!</i></b></p> | ||
|
||
<ul> | ||
<li style="font-size: 20px;">ul tags are nested within li tags in this example</li> | ||
<ul> | ||
<li>I'm a child of the previous li</li> | ||
</ul> | ||
</ul> | ||
|
||
<ul> | ||
<li><strong>Bullet lists</strong></li> | ||
<ul> | ||
<li>With Indents</li> | ||
<ul> | ||
<li>Lots of <strong>Indents</strong></li> | ||
</ul> | ||
<li>And back</li> | ||
</ul> | ||
</ul> | ||
|
||
<div> | ||
<ol> | ||
<li>Ordered Lists</li> | ||
<ol> | ||
<li>With indents</li> | ||
<ol> | ||
<li>And more indents</li> | ||
</ol> | ||
</ol> | ||
<li>Test2</li> | ||
</ol> | ||
<div> | ||
<img src="https://www.google.co.uk/images/srpr/logo3w.png" | ||
style="cursor: default; float: none; margin: 0px; " | ||
alt="Images! This is the 'alt' attribute"><br> | ||
</div> | ||
</div> | ||
|
||
<table id="table49885" border=1> | ||
<tbody> | ||
<tr> | ||
<td class="">TABLES </td> | ||
|
||
<td><b>with styles</b> </td> | ||
|
||
<td><i><u>and stuff</u></i> </td> | ||
|
||
<td>cool eh? </td> | ||
</tr> | ||
|
||
<tr> | ||
<td> | ||
<ul> | ||
<li>We can have these here <br> | ||
</li> | ||
</ul></td> | ||
|
||
<td class=""> | ||
<ol> | ||
<li> and these<br> | ||
</li> | ||
<ol> | ||
<li>here</li> | ||
</ol> | ||
</ol></td> | ||
|
||
<td class=""> <img src="https://www.google.co.uk/images/srpr/logo3w.png" style="cursor: default; "></td> | ||
|
||
<td class="current"> meh</td> | ||
</tr> | ||
</tbody> | ||
</table> |