Summarisation with structured data #26641
-
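To build a model that can generate summaries of documents based on filters such as created_by, department_name, and created_on, you can filter the documents on those attributes first and then summarise each match with LangChain.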
Here is a sample code snippet to illustrate the process:

# Import paths follow the legacy `langchain` package layout; newer releases
# expose these classes under `langchain_community`, `langchain_openai`
# and `langchain_text_splitters`.
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
# Load documents
documents = [
{
"content": "Effective: March 2020\nPurpose\n\nThe purpose of this full-time work-from-home policy is to provide guidelines and support for employees to conduct their work remotely, ensuring the continuity and productivity of business operations during the COVID-19 pandemic and beyond.\nScope\n\nThis policy applies to all employees who are eligible for remote work as determined by their role and responsibilities. It is designed to allow employees to work from home full time while maintaining the same level of performance and collaboration as they would in the office.\nEligibility\n\nEmployees who can perform their work duties remotely and have received approval from their direct supervisor and the HR department are eligible for this work-from-home arrangement.\nEquipment and Resources\n\nThe necessary equipment and resources will be provided to employees for remote work, including a company-issued laptop, software licenses, and access to secure communication tools. Employees are responsible for maintaining and protecting the company's equipment and data.\nWorkspace\n\nEmployees working from home are responsible for creating a comfortable and safe workspace that is conducive to productivity. This includes ensuring that their home office is ergonomically designed, well-lit, and free from distractions.\nCommunication\n\nEffective communication is vital for successful remote work. Employees are expected to maintain regular communication with their supervisors, colleagues, and team members through email, phone calls, video conferences, and other approved communication tools.\nWork Hours and Availability\n\nEmployees are expected to maintain their regular work hours and be available during normal business hours, unless otherwise agreed upon with their supervisor. Any changes to work hours or availability must be communicated to the employee's supervisor and the HR department.\nPerformance Expectations\n\nEmployees working from home are expected to maintain the same level of performance and productivity as if they were working in the office. Supervisors and team members will collaborate to establish clear expectations and goals for remote work.\nTime Tracking and Overtime\n\nEmployees are required to accurately track their work hours using the company's time tracking system. Non-exempt employees must obtain approval from their supervisor before working overtime.\nConfidentiality and Data Security\n\nEmployees must adhere to the company's confidentiality and data security policies while working from home. This includes safeguarding sensitive information, securing personal devices and internet connections, and reporting any security breaches to the IT department.\nHealth and Well-being\n\nThe company encourages employees to prioritize their health and well-being while working from home. This includes taking regular breaks, maintaining a work-life balance, and seeking support from supervisors and colleagues when needed.\nPolicy Review and Updates\n\nThis work-from-home policy will be reviewed periodically and updated as necessary, taking into account changes in public health guidance, business needs, and employee feedback.\nQuestions and Concerns\n\nEmployees are encouraged to direct any questions or concerns about this policy to their supervisor or the HR department.\n",
"summary": "This policy outlines the guidelines for full-time remote work, including eligibility, equipment and resources, workspace requirements, communication expectations, performance expectations, time tracking and overtime, confidentiality and data security, health and well-being, and policy reviews and updates. Employees are encouraged to direct any questions or concerns",
"name": "Work From Home Policy",
"created_on": "2020-03-01",
"department_name": "HR",
"created_by": "John Doe"
},
# Add more documents here
]
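# Initialize the LLM (assumes an OpenAI completion model and that
# OPENAI_API_KEY is set in the environment)
from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
# Keep the raw records (plain dicts) for attribute filtering
docs = documents
# Split the document text into chunks, carrying the other attributes along as metadata
splitter = RecursiveCharacterTextSplitter()
chunks = splitter.create_documents(
    [d["content"] for d in documents],
    metadatas=[{k: v for k, v in d.items() if k != "content"} for d in documents],
)
# Create embeddings and index the chunks
# (only needed for similarity search, not for the summaries below)
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(chunks, embeddings)
# Define a prompt template for summary generation
prompt_template = PromptTemplate(
    input_variables=["document"],
    template="Summarize the following document: {document}"
)
# Create a chain for summary generation
summary_chain = LLMChain(
    llm=llm,
    prompt=prompt_template
)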
# Function to filter documents; a filter left as None is ignored
def filter_documents(docs, created_by=None, department_name=None, created_on=None):
    filtered_docs = []
    for doc in docs:
        if (created_by and doc['created_by'] != created_by) or \
           (department_name and doc['department_name'] != department_name) or \
           (created_on and doc['created_on'] != created_on):
            continue
        filtered_docs.append(doc)
    return filtered_docs
# Filter documents based on criteria
filtered_docs = filter_documents(docs, created_by="John Doe", department_name="HR", created_on="2020-03-01")
# Generate summaries for filtered documents
summaries = []
for doc in filtered_docs:
    summary = summary_chain.run(document=doc['content'])
    summaries.append(summary)
# Output summaries
for summary in summaries:
    print(summary)

This code demonstrates how to load documents, filter them based on specified criteria, and generate summaries using LangChain. Adjust the filtering criteria and document content as needed [1][2].
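The loop above produces one summary per matching document, whereas the question asks for a summary of all matching documents together. One way to bridge that gap is a small map-reduce step: summarise each match, then summarise the combined partial summaries. The sketch below is only an illustration under stated assumptions: `summarise_matching` and `first_sentence` are hypothetical names, the sample records are made up, and the summariser is a plain callable so the example runs without an API key.

```python
from typing import Callable, Dict, List


def summarise_matching(
    docs: List[Dict[str, str]],
    summarise: Callable[[str], str],
    **filters: str,
) -> str:
    """Summarise every document whose attributes equal all given filters."""
    # Keep a document only if every supplied attribute matches exactly.
    matching = [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in filters.items())
    ]
    if not matching:
        return ""
    # Map step: one summary per matching document.
    partial = [summarise(doc["content"]) for doc in matching]
    if len(partial) == 1:
        return partial[0]
    # Reduce step: summarise the concatenated partial summaries.
    return summarise("\n\n".join(partial))


# Stand-in summariser for demonstration only: keeps the text up to the first ". ".
def first_sentence(text: str) -> str:
    return text.split(". ")[0].rstrip(".") + "."


# Made-up sample records using the attribute names from this thread.
sample = [
    {"content": "Remote work is allowed. Details follow.", "created_by": "A", "department_name": "B"},
    {"content": "Laptops are company property. Return them on exit.", "created_by": "A", "department_name": "B"},
    {"content": "Travel needs approval. See finance.", "created_by": "C", "department_name": "B"},
]
result = summarise_matching(sample, first_sentence, created_by="A", department_name="B")
print(result)
```

With the chain defined earlier, the callable could be something like `lambda text: summary_chain.run(document=text)`, though long inputs may need chunking before the reduce step.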
-
filtered_docs = filter_documents(docs, created_by="John Doe", department_name="HR", created_on="2020-03-01")
-
I have some documents in a structured database, along with some other attributes.
I want to build a model that can generate summaries of documents based on filters over those attributes.
E.g., a table with these columns:
Document_ID | Document_info | date | department_name | created_by | created_on
A question could be: "Please summarise all the documents created by A in department B that were created on Y."
Could you please advise on the best way to do this?
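
For the filtering half of this, one possible starting point is to let the database do the work and only pass the matching `Document_info` text on to a summariser. The following is a minimal sketch, assuming the table lives in a SQL database (SQLite here), is called `documents`, and stores dates as ISO strings; the function name `fetch_document_texts`, the sample rows, and the omission of the `date` column are illustrative choices, not part of the original setup.

```python
import sqlite3
from typing import List, Optional

# Only these column names may appear in the WHERE clause.
FILTER_COLUMNS = ("created_by", "department_name", "created_on")


def fetch_document_texts(conn: sqlite3.Connection, **filters: Optional[str]) -> List[str]:
    """Return Document_info for rows matching every filter that is not None."""
    clauses, params = [], []
    for column, value in filters.items():
        if column not in FILTER_COLUMNS:
            raise ValueError(f"unsupported filter: {column}")
        if value is not None:
            # Column names come from the whitelist; values are bound as parameters.
            clauses.append(f"{column} = ?")
            params.append(value)
    sql = "SELECT Document_info FROM documents"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY Document_ID"
    return [row[0] for row in conn.execute(sql, params)]


# Demo with an in-memory database and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (Document_ID INTEGER PRIMARY KEY, Document_info TEXT, "
    "department_name TEXT, created_by TEXT, created_on TEXT)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Policy text one", "B", "A", "2020-03-01"),
        (2, "Policy text two", "B", "A", "2020-04-15"),
        (3, "Policy text three", "C", "A", "2020-03-01"),
    ],
)
texts = fetch_document_texts(conn, created_by="A", department_name="B", created_on="2020-03-01")
print(texts)
```

Each returned text can then be sent to whatever summarisation step is chosen; turning a free-form question into the `created_by`, `department_name`, and `created_on` values is a separate problem that this sketch does not address.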