feat: add a script to automatically generate doc #1648

Open
wants to merge 5 commits into
base: main
22 changes: 22 additions & 0 deletions docs/guide/DEVELOPMENT.md
@@ -100,3 +100,25 @@ npm run test
- node 14+
- npm 8+

# Update doc

Whenever [FuryBuilder](https://github.com/apache/incubator-fury/blob/main/java/fury-core/src/main/java/org/apache/fury/config/FuryBuilder.java) changes, whether an existing configuration field is modified or a new one is added, the table in [java_serialization_guide.md#furybuilder--options](https://github.com/apache/incubator-fury/blob/main/docs/guide/java_serialization_guide.md#furybuilder--options) must be updated to match.

A script updates the table automatically. After modifying [FuryBuilder](https://github.com/apache/incubator-fury/blob/main/java/fury-core/src/main/java/org/apache/fury/config/FuryBuilder.java), run the following command:

```bash
python3 tools/gen_fury_builder_doc.py
```

For the script to pick up a field, each [FuryBuilder](https://github.com/apache/incubator-fury/blob/main/java/fury-core/src/main/java/org/apache/fury/config/FuryBuilder.java) configuration field needs a Javadoc comment in the following format:

```
/**
 * Comment body
 *
 * @defaultValue xxx
 */
```

The `Comment body` becomes the `Description` column of the table, and the value of the `@defaultValue` tag becomes the `Default Value` column.
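
As an illustration of that mapping, the sketch below splits a comment in this format into a description and a default value. It works on the raw comment text rather than on generated Javadoc HTML, so it is not how `tools/gen_fury_builder_doc.py` does it; `parse_config_comment` and `DEFAULT_VALUE_TAG` are hypothetical names introduced only for this example.

```python
import re

# Hypothetical helper pattern: captures the text after the @defaultValue tag.
DEFAULT_VALUE_TAG = re.compile(r"@defaultValue\s+(.*)")


def parse_config_comment(comment):
    """Split a Javadoc block into (description, default_value)."""
    description_parts = []
    default_value = None
    for raw_line in comment.splitlines():
        # Drop the comment delimiters and the leading '*' of each line.
        line = raw_line.strip()
        if line.startswith("/**"):
            line = line[3:]
        if line.endswith("*/"):
            line = line[:-2]
        line = line.lstrip("*").strip()
        if not line:
            continue
        tag = DEFAULT_VALUE_TAG.match(line)
        if tag:
            default_value = tag.group(1).strip()
        elif not line.startswith("@"):
            # Any non-tag line is part of the description.
            description_parts.append(line)
    return " ".join(description_parts), default_value
```

A field without a `@defaultValue` tag yields `None` as its default value here, which a caller could use to skip non-configuration fields.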

3 changes: 2 additions & 1 deletion docs/guide/java_serialization_guide.md
@@ -95,7 +95,7 @@ public class Example {
```

## FuryBuilder options

<!-- Auto generate region begin -->
| Option Name | Description | Default Value |
|-------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|
| `timeRefIgnored` | Whether to ignore reference tracking of all time types registered in `TimeSerializers` and subclasses of those types when ref tracking is enabled. If ignored, ref tracking of every time type can be enabled by invoking `Fury#registerSerializer(Class, Serializer)`. For example, `fury.registerSerializer(Date.class, new DateSerializer(fury, true))`. Note that enabling ref tracking should happen before serializer codegen of any types which contain time fields. Otherwise, those fields will still skip ref tracking. | `true` |
@@ -114,6 +114,7 @@ public class Example {
| `codeGenEnabled` | Disabling may result in faster initial serialization but slower subsequent serializations. | `true` |
| `asyncCompilationEnabled` | If enabled, serialization uses interpreter mode first and switches to JIT serialization after async serializer JIT for a class is finished. | `false` |
| `scalaOptimizationEnabled` | Enables or disables Scala-specific serialization optimization. | `false` |
<!-- Auto generate region end -->
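
The two marker comments delimit the region that the script regenerates. A minimal sketch of that replacement step follows, assuming the marker strings shown in this diff; `replace_generated_region` and `REGION` are hypothetical names, and the use of `re.escape` plus a function replacement is a choice made for this example rather than what the script does.

```python
import re

DOC_GEN_BEGIN = "<!-- Auto generate region begin -->"
DOC_GEN_END = "<!-- Auto generate region end -->"

# Non-greedy match from the begin marker to the first end marker.
REGION = re.compile(
    re.escape(DOC_GEN_BEGIN) + r".*?" + re.escape(DOC_GEN_END), flags=re.S
)


def replace_generated_region(document, table):
    """Return document with the text between the markers replaced by table."""
    if not REGION.search(document):
        raise ValueError("auto-generate markers not found")
    replacement = f"{DOC_GEN_BEGIN}\n{table.rstrip()}\n{DOC_GEN_END}"
    # A function replacement keeps any backslashes in the table literal.
    return REGION.sub(lambda _match: replacement, document, count=1)
```

Passing a function to `sub` avoids having backslashes in a field description interpreted as escape sequences in the replacement string.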

## Advanced Usage

15 changes: 15 additions & 0 deletions java/fury-core/pom.xml
@@ -128,6 +128,21 @@
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-javadoc-plugin</artifactId>
        <configuration>
          <charset>UTF-8</charset>
          <encoding>UTF-8</encoding>
          <tags>
            <tag>
              <name>defaultValue</name>
              <placement>f</placement>
              <head>defaultValue</head>
            </tag>
          </tags>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
225 changes: 225 additions & 0 deletions tools/gen_fury_builder_doc.py
@@ -0,0 +1,225 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import os
import re
import dataclasses
import subprocess
import tempfile
import shutil

PROJECT_ROOT_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "../")

FURY_BUILDER_PATH = os.path.join(
    PROJECT_ROOT_DIR,
    "java/fury-core/src/main/java/org/apache/fury/config/FuryBuilder.java",
)
JAVA_DOC_PATH = os.path.join(PROJECT_ROOT_DIR, "docs/guide/java_serialization_guide.md")

DOC_GEN_BEGIN = "<!-- Auto generate region begin -->"
DOC_GEN_END = "<!-- Auto generate region end -->"

FIELD_LINE_PATTERN = re.compile(
    r"^(public\s+|protected\s+|private\s+|)(\w+|\w+<.+>)\s+\w+$"
)
FILE_REPLACE_PATTERN = re.compile(
    rf"^{DOC_GEN_BEGIN}.*{DOC_GEN_END}$", flags=re.MULTILINE | re.S
)


@dataclasses.dataclass
class FieldInfo:
    field_scope: str
    field_name: str
    field_type: str
    field_default_val: str
    field_comment: str


def _parse_fields(fields_content):
    fields_info = []
    for field in fields_content:
        """
        Field format:
        <ul class="blockList">
        <li class="blockList">
        <h4>language</h4>
        <pre><a href="Language.html" title="enum in org.apache.fury.config">Language</a> language</pre>
        <div class="block">Whether cross-language serialize the object. If you used fury for java only, please set
        language to <a href="Language.html#JAVA"><code>Language.JAVA</code></a>, which will have much better performance.
        </div>
        <dl>
        <dt><span class="simpleTagLabel">defaultValue</span></dt>
        <dd>Language.JAVA</dd>
        </dl>
        </li>
        </ul>
        """
        tag_labels = field.xpath('li/dl/dt/span[@class="simpleTagLabel"]/text()')
Collaborator

This seems coupled with the Javadoc HTML structure. Will the HTML generated by Javadoc change between JDK vendors and JDK versions?

Contributor Author

Yes. We may be able to have the script automatically install a fixed JDK version and vendor before generating the Javadoc.

Or do you have any other good ideas?
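
One possible direction, offered only as a sketch: read the `defaultValue` entries with the standard-library `html.parser` and rely solely on the `<dt>`/`<dd>` pairing, not on the surrounding `ul`/`li`/`section` nesting. This assumes the label text stays `defaultValue` and that its value follows in the next `<dd>`; whether that holds across JDK vendors and versions is not verified here. `DefaultValueCollector` and `extract_default_values` are hypothetical names.

```python
from html.parser import HTMLParser


class DefaultValueCollector(HTMLParser):
    """Collect the <dd> text that follows each <dt> labelled 'defaultValue'."""

    def __init__(self):
        super().__init__()
        self.values = []  # collected default values, in document order
        self._in_dt = False
        self._label_seen = False  # True once a 'defaultValue' <dt> was seen
        self._in_dd = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "dt":
            self._in_dt = True
        elif tag == "dd" and self._label_seen:
            self._in_dd = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "dt":
            self._in_dt = False
        elif tag == "dd" and self._in_dd:
            self.values.append("".join(self._buf).strip())
            self._in_dd = False
            self._label_seen = False

    def handle_data(self, data):
        # Tolerate a trailing colon in the label, e.g. "defaultValue:".
        if self._in_dt and data.strip().rstrip(":") == "defaultValue":
            self._label_seen = True
        elif self._in_dd:
            self._buf.append(data)


def extract_default_values(html_text):
    """Return every default value found in html_text."""
    collector = DefaultValueCollector()
    collector.feed(html_text)
    return collector.values
```

This would also remove the need to install `lxml` at run time, at the cost of hand-written state tracking.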

        is_config_field = "defaultValue" in tag_labels
        if not is_config_field:
            continue

        field_default_val = "".join(field.xpath("li/dl/dd//text()"))
        field_name = field.xpath("li/h4/text()")[0]
        field_comment = "".join(field.xpath("li/div//text()")).replace("\n", "")

        field_line = "".join(field.xpath("li/pre//text()"))
        match = FIELD_LINE_PATTERN.search(field_line)
        scope_group = match.group(1).strip()
        field_scope = None if len(scope_group) == 0 else scope_group
        field_type = match.group(2)
        fields_info.append(
            FieldInfo(
                field_scope, field_name, field_type, field_default_val, field_comment
            )
        )

    return fields_info


def _write_content(fields):
    if len(fields) == 0:
        return

    with open(JAVA_DOC_PATH) as f:
        content = f.read()
    if content is None:
        return

    """
    Table format:
    | Option Name | Description | Default Value | <------ Table header
    |---------------|--------------------|-------------------| <------ Table delimiter
    | `xxxxxx` | xxxxxxxx | xxxxx | <------ Table body
    """
    hdr1 = " Option Name"
    hdr2 = " Description"
    hdr3 = " Default Value"
    margin_right = 10
    hdr1_width = len(hdr1) + margin_right
    hdr2_width = len(hdr2) + margin_right
    hdr3_width = len(hdr3) + margin_right
    for field in fields:
        fname = field.field_name
        fdesc = (
            field.field_comment
            if field.field_comment is not None and len(field.field_comment) > 0
            else "None"
        )
        fval = (
            field.field_default_val
            if field.field_default_val is not None and len(field.field_default_val) > 0
            else "None"
        )

        if len(fname) > hdr1_width:
            hdr1_width = len(fname) + margin_right

        if len(fdesc) > hdr2_width:
            hdr2_width = len(fdesc) + margin_right

        if len(fval) > hdr3_width:
            hdr3_width = len(fval) + margin_right

    table_header = (
        f"|{hdr1}{(hdr1_width - len(hdr1)) * ' '}"
        f"|{hdr2}{(hdr2_width - len(hdr2)) * ' '}"
        f"|{hdr3}{(hdr3_width - len(hdr3)) * ' '}|"
        f"\n"
    )
    table_delimiter = f"|{hdr1_width * '-'}|{hdr2_width * '-'}|{hdr3_width * '-'}|\n"
    table_body = ""
    for field in fields:
        fname = field.field_name
        fdesc = (
            field.field_comment
            if field.field_comment is not None and len(field.field_comment) > 0
            else "None"
        )
        fval = (
            field.field_default_val
            if field.field_default_val is not None and len(field.field_default_val) > 0
            else "None"
        )

        row = (
            f"|`{fname}`{(hdr1_width - len(fname) - 2) * ' '}"
            f"| {fdesc}{(hdr2_width - len(fdesc) - 1) * ' '}"
            f"| {fval}{(hdr3_width - len(fval) - 1) * ' '}|"
            f"\n"
        )
        table_body += row

    table = table_header + table_delimiter + table_body
    repl = f"{DOC_GEN_BEGIN}\n{table}\n{DOC_GEN_END}"
    to_write = FILE_REPLACE_PATTERN.sub(repl, content)

    with open(JAVA_DOC_PATH, "w") as f:
        f.write(to_write)


def main():
    # 1. Try installing lxml
    try_count = 3
    while try_count >= 0:
        try:
            from lxml import etree
        except Exception:
            if try_count == 0:
                raise Exception(f"Retrying {try_count} times to install lxml failed.")
            print("Try installing lxml.")
            subprocess.check_call("pip3 install lxml", shell=True)
        finally:
            try_count -= 1

    # 2. Generating javadoc
    print("Generating javadoc...")
    tmp_dir = tempfile.gettempdir()
    output_dir = "output"
    subprocess.call(
        f"cd {PROJECT_ROOT_DIR}java/fury-core;"
        f"mvn javadoc:javadoc -DreportOutputDirectory={tmp_dir} -DdestDir={output_dir} -Dshow=private",
        shell=True,
        stdout=subprocess.DEVNULL,
    )
    print("javadoc generated successfully.")

    # 3. Parsing javadoc
    javadoc_dir = os.path.join(tmp_dir, output_dir)
    fury_build_src = os.path.join(
        javadoc_dir, "org/apache/fury/config/FuryBuilder.html"
    )
    with open(fury_build_src) as f:
        content = f.read()
    shutil.rmtree(javadoc_dir)

    html = etree.HTML(content)
    # There is only one `Field Detail`
    field_detail = html.xpath('//div[@class="details"]//section[1]')[0]
    if field_detail is None:
        raise Exception(
            "There is no `Field Detail` related content in the current Javadoc."
        )
    fields_content = field_detail.xpath("ul/li/ul")
    fields_info = _parse_fields(fields_content)
    _write_content(fields_info)
    print("Doc update completed.")


if __name__ == "__main__":
    main()
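
In step 2 of `main`, `subprocess.call` returns Maven's exit status without checking it, so the success message prints even when Javadoc generation fails. A hedged sketch of an alternative follows: it passes the same Maven arguments as a list, sets the working directory with `cwd` instead of a shell `cd`, and uses `check=True` so a non-zero exit raises `subprocess.CalledProcessError`. `build_javadoc_command`, `generate_javadoc`, and the `runner` parameter are hypothetical names introduced for this example.

```python
import subprocess


def build_javadoc_command(report_dir, dest_dir):
    """Return the Maven argument list used to generate private-level Javadoc."""
    return [
        "mvn",
        "javadoc:javadoc",
        f"-DreportOutputDirectory={report_dir}",
        f"-DdestDir={dest_dir}",
        "-Dshow=private",
    ]


def generate_javadoc(module_dir, report_dir, dest_dir, runner=subprocess.run):
    """Run Maven in module_dir; a non-zero exit raises CalledProcessError."""
    return runner(
        build_javadoc_command(report_dir, dest_dir),
        cwd=module_dir,
        stdout=subprocess.DEVNULL,
        check=True,
    )
```

The `runner` parameter exists only so the function can be exercised without Maven installed; in the script it would default to `subprocess.run`.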