This repository has been archived by the owner on Oct 20, 2022. It is now read-only.

deepset-ai/haystack-website

Development

Getting Started

First, install the dependencies (if you run into issues with this step, make sure to update Node to the latest version):

yarn install

Part of the documentation source lives within the Haystack repo and the build system expects to find it locally, so before running the development server run this command to get a local copy of Haystack:

yarn haystack

At this point you can run the development server:

yarn dev

Open http://localhost:3000 with your browser to see the result.

When editing .mdx files, you can run the following command to see your changes update automatically:

yarn dev:watch

Note: This setup is tested with Node v14.17.5 and might be incompatible with older or newer versions.

Environment Variables

If you have permission issues when starting up, get a personal access token from GitHub. The public_repo scope is sufficient.

Create a .env.local file and add your token as an env variable:

GITHUB_PERSONAL_ACCESS_TOKEN="youraccesstoken"

Required Reading

This project makes heavy use of Next.js's getStaticProps and getStaticPaths functions to fetch markdown files at build time (locally from the docs directory as well as from GitHub using the GitHub API) and to generate HTML pages for each of these files. Before working on the project, it's vital that you understand how these functions work and how they apply to this project. This example and this example may be used as simple demonstrations of these functions to solidify your understanding.
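As a rough illustration of how the two functions fit together, the sketch below mirrors their return shapes without importing Next.js. The in-memory `DOCS` map, the `listPaths` and `loadProps` helpers, and the slugs are hypothetical stand-ins for this project's markdown loading; only the `{ paths, fallback }`, `{ props }`, and `{ notFound }` return shapes come from the Next.js API.

```typescript
// Hypothetical stand-in for markdown files on disk or on GitHub.
const DOCS: Record<string, string> = {
  "get-started": "# Get Started",
  "intro": "# What is Haystack?",
};

type StaticPath = { params: { slug: string } };
type PageProps = { slug: string; markdown: string };

// Pure helper: one path entry per markdown file.
function listPaths(docs: Record<string, string>): StaticPath[] {
  return Object.keys(docs).map((slug) => ({ params: { slug } }));
}

// Pure helper: the props for a single page, or null if the slug is unknown.
function loadProps(docs: Record<string, string>, slug: string): PageProps | null {
  const markdown = docs[slug];
  return markdown === undefined ? null : { slug, markdown };
}

// Next.js calls this at build time to learn which pages to generate.
// In a real page file this function would be exported.
async function getStaticPaths() {
  return { paths: listPaths(DOCS), fallback: false };
}

// Next.js then calls this once per path to get the props for that page.
// In a real page file this function would be exported.
async function getStaticProps({ params }: StaticPath) {
  const pageProps = loadProps(DOCS, params.slug);
  return pageProps ? { props: pageProps } : { notFound: true };
}
```

With `fallback: false`, any slug that `getStaticPaths` does not list results in a 404 page.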

Docs Publishing Process

Overview & Usage Docs

These docs live in the docs directory, in the given version directory. The docs are written in .mdx, which allows us to include JSX inside these files. This allows us to add Headless UI components, a React component library based on Tailwind CSS. See the components/Disclosures and components/Tabs components as examples, and see .mdx files such as docs/v0.9.0/overview/get-started.mdx for how they are used.

Whenever you want to edit or create documentation, simply add .mdx files to a given version directory or edit the existing .mdx files. New files require one additional step: add the new page to the menu.json file, which is located in the folder docs/vX.X.X. In the same way, remove a page from menu.json if it is no longer needed, e.g., if the corresponding module has been deleted in Haystack and its documentation is therefore obsolete.

When you push a branch with your changes to GitHub, Vercel will automatically generate a preview environment for you (check the Vercel Dashboard to find the preview URL).

Tutorial & Reference Docs

These docs live in the Haystack repository, in the given version directory. The docs are generated markdown files and must be fetched before the build starts. Thanks to Incremental Static Regeneration (a Next.js feature supported by Vercel), the static pages we create for these docs are kept up to date. This means that if existing tutorials or references are changed, the changes will be visible on the docs website automatically.
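A minimal sketch of how regeneration is typically switched on in Next.js: getStaticProps returns a `revalidate` interval in seconds alongside the props. The `fetchTutorial` and `withRevalidate` helpers and the 60-second interval are assumptions for illustration, not values taken from this project.

```typescript
// Hypothetical stand-in for fetching a generated markdown file from GitHub.
async function fetchTutorial(slug: string): Promise<string> {
  return `# Tutorial ${slug}`;
}

// Assumed interval: how long a generated page may be served before
// Next.js regenerates it in the background on a later request.
const REVALIDATE_SECONDS = 60;

// Wraps page props in the shape getStaticProps returns when
// Incremental Static Regeneration is enabled.
function withRevalidate<P>(props: P, seconds: number = REVALIDATE_SECONDS) {
  return { props, revalidate: seconds };
}

// In a real page file this function would be exported.
async function getStaticProps({ params }: { params: { slug: string } }) {
  const markdown = await fetchTutorial(params.slug);
  return withRevalidate({ markdown });
}
```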

Adding a new Tutorial Page

In the Haystack repo, add an entry to haystack/docs/_src/tutorials/tutorials/headers.py that corresponds to your new tutorial. When you push your changes to any branch, a GitHub Action calls haystack/docs/_src/tutorials/tutorials/convert_ipynb.py to generate a .md version of the tutorial in the same folder. These .md files are generally called something like 12.md.

Then, in this repo, add an entry to haystack-website/lib/constants.ts that refers to the new .md file in Haystack. Please add the new file only to the latest version. If you remove files, you also have to remove them from the latest version. To make the tutorial appear in the left-hand table of contents, add a new entry to haystack-website/docs/latest/menu.json.

For example:

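// Fetch a docs file from the Haystack repo through the GitHub API.
const res = await octokit.rest.repos.getContent({
  owner: "deepset-ai",
  repo: "haystack",
  // Versioned docs live under docs/<version>; "latest" reads from docs/ directly.
  path: `docs${version && version !== "latest" ? `/${version}` : ""}${repoPath}${filename}`,
  // Branch, tag, or commit to read the file from.
  ref: HAYSTACK_BRANCH_NAME,
});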

Preview from non-main branches

To preview docs that are on a non-main branch of the Haystack repo, run this project locally and open lib/github.ts. There, add a ref parameter to the octokit.rest.repos.getContent function call and set it to the name of the branch that you would like to preview. You also need to add the tutorials/references you would like to preview to docs/{GIVEN_VERSION}/menu.json and lib/constants.ts.
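The path expression and the ref override can be sketched as a small pure helper. `buildGetContentParams` and its argument names are hypothetical, and the path segments and branch name in the usage lines are made up for illustration; the `owner`, `repo`, `path`, and `ref` fields are the ones passed to octokit.rest.repos.getContent.

```typescript
type ContentParams = { owner: string; repo: string; path: string; ref?: string };

// Builds the argument object for octokit.rest.repos.getContent.
// "latest" (or no version) reads from docs/ directly; any other version
// reads from docs/<version>/.
function buildGetContentParams(
  repoPath: string,
  filename: string,
  version?: string,
  ref?: string
): ContentParams {
  const versionDir = version && version !== "latest" ? `/${version}` : "";
  const params: ContentParams = {
    owner: "deepset-ai",
    repo: "haystack",
    path: `docs${versionDir}${repoPath}${filename}`,
  };
  // Only set ref when a specific branch should be previewed; without it,
  // the GitHub API reads from the repository's default branch.
  if (ref) params.ref = ref;
  return params;
}

// Illustrative usage: latest docs from the default branch, and a versioned
// file from a hypothetical feature branch.
const latestParams = buildGetContentParams("/_src/tutorials/tutorials/", "12.md", "latest");
const previewParams = buildGetContentParams(
  "/_src/tutorials/tutorials/",
  "12.md",
  "v0.9.0",
  "my-feature-branch"
);
```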

Redirects In Case of Renaming or Restructuring Pages

When you rename documentation pages or restructure the directories that contain them, the new file path can cause old links to break. For example, when the pipeline_nodes grouping was created, components/reader.mdx no longer existed because it had moved to pipeline_nodes/reader.mdx. This meant that links on websites pointing to the old path were broken.

To make sure links aren't broken please follow these steps:

  1. Identify what path is no longer valid and what new path is the most appropriate for it to point to

  2. Populate the redirects() function in next.config.js with an entry containing source, destination and permanent:

    {
      source: '/the/old/path',
      destination: '/the/new/path',
      permanent: true,
    }
    

    The haystack-website/docs/generate_redirect_table.py script will generate a set of suggested mappings. In cases where the directory structure has changed but the filename has stayed the same, this script will map from the old link to the new link in latest. In cases where the filename has changed, this script will identify the old link but not provide a suggestion for a new link. Update the MANUAL_REDIRECTS option to define any custom destinations.

  3. Push the changes to your branch and test that the old paths still work and point to the intended destination. You can do this by checking out the Preview that Vercel will produce.
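Put together, the steps above might look like the sketch below. The two paths are illustrative, based on the reader.mdx example earlier in this section rather than on the site's real URL structure, and `resolveRedirect` is a hypothetical helper for a quick local sanity check. It only handles exact matches, whereas Next.js also supports path patterns in `source`.

```typescript
type Redirect = { source: string; destination: string; permanent: boolean };

// Entries as they would be returned from redirects() in next.config.js.
const REDIRECTS: Redirect[] = [
  {
    source: "/components/reader", // old path (illustrative)
    destination: "/pipeline_nodes/reader", // new path (illustrative)
    permanent: true,
  },
];

// Same shape as the redirects() option in next.config.js.
async function redirects(): Promise<Redirect[]> {
  return REDIRECTS;
}

// Hypothetical helper: returns the destination for an old path,
// or the path itself when no redirect entry matches exactly.
function resolveRedirect(path: string, entries: Redirect[] = REDIRECTS): string {
  const match = entries.find((entry) => entry.source === path);
  return match ? match.destination : path;
}
```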

Updating docs after a release

When there's a new Haystack release, we need to create a directory for the new version within the local /docs directory. In this directory, we can write new overview and usage docs in .mdx (or manually copy over the ones from the previous version directory). Once this is done, the project will automatically fetch the reference and tutorial docs for the new version from GitHub. Bear in mind that a menu.json file needs to exist in every new version directory so that our Menu components know which page links to display.

Moreover, links that currently point to the latest version need to point to the new version. Update the links in the docs using haystack-website/docs/update_links.py. The command you run should look something like python update_links.py -d v0.3.0 -v v0.3.0. This script prints the changes to the console. Scan through them as a sanity check.

Additionally, the referenceFiles and tutorialFiles constants in lib/constants need to be updated with any new reference or tutorial docs that get created as part of a new release. During a release, please add new referenceFiles and tutorialFiles objects for the release number to that file. This change also has implications for the files tutorials/[...slug].tsx and reference/[...slug].tsx. Please update the getStaticPaths and getStaticProps functions in both files with an array representing the latest version.

In the Haystack repo, we have to release the API and tutorial docs by copying them to a new version folder as well. If you want to include files from a branch other than main, follow the steps in Preview from non-main branches. Lastly, we have to update the constant specified in the components/VersionSelect component, so that we default to the new version when navigating between pages.

After releasing the docs, we need to release the benchmarks. Create a new version folder in the folder benchmarks and copy all folders from latest to the new folder.

If you now start the local server and go to the new version, you will see the 404 page. This is because we pull the versions from the Haystack release tags, and the newest version has most likely not been released yet. Therefore, you have to add it manually to the tagNames array in the getDocsVersions function by adding the statement tagNames.push('v0.10.0');.
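A minimal sketch of that manual step, assuming tagNames is a plain array of version strings. `addUnreleasedVersion` is a hypothetical helper, not a function from this project; unlike a bare push, it does not add a duplicate once the release tag exists.

```typescript
// Adds a not-yet-released version to the list of tag names, unless the
// release tag has already been published and is therefore present.
function addUnreleasedVersion(tagNames: string[], version: string): string[] {
  return tagNames.indexOf(version) === -1 ? [...tagNames, version] : tagNames;
}

// Example: the tags fetched from GitHub do not yet contain v0.10.0.
const fetchedTags = ["v0.8.0", "v0.9.0"];
const tagNames = addUnreleasedVersion(fetchedTags, "v0.10.0");
```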

Styling

We use Tailwind for CSS. It's a CSS utility library, which allows us to write barely any CSS ourselves. The tailwind.config.js file contains configuration to provide classes that match deepset.ai's new style guide. Additionally, there is a styles/global.css file, which loads our custom font provided by the style guide. Lastly, we have two CSS module files within the components directory (markdown.module.css and tutorial.module.css), which are applied to the components/Layout component. These files allow us to provide some defaults for certain HTML elements, which get applied to the HTML tags generated when we convert markdown to HTML at build time. We also use a React component library authored by the Tailwind team, called Headless UI. This allows us to easily create React components such as the components/Tabs and components/Disclosures components.

Deployment

This application gets deployed on Vercel. In the dashboard, connect the haystack-website repo to a new project and it should handle builds, preview environments (all branches other than main), and production environments (main branch) automatically. Be sure to include yarn haystack in the list of build commands.

Future Work

Convert the remote markdown files for references and tutorials to .mdx, so that we can inject React components into these. This would also allow for more code sharing between the overview+usage pages and tutorial+reference pages.