Ask HN: What is the state of AI data annotation of pdf documents using LLM

Apr 01, 2024 - news.ycombinator.com
The author is seeking advice on how to label segments of a corpus of scientific papers that contain applications to organic chemistry. They are unsure whether they need to train models to detect and label these segments or whether they can simply feed the data into a large language model (LLM) without prior training. They also want to know which services or libraries are the best and most cost-effective for this type of workflow.

The author is not looking for a complex labeling system, but rather a simple one that can identify and label passages that are relevant to organic chemistry. They are looking for a solution that is efficient and does not require them to create a new system from scratch. They are open to suggestions on the most suitable and affordable tools or services that can help them achieve this.

Key takeaways:

  • The user has a corpus of scientific papers and wants to label segments that contain an application to organic chemistry.
  • The labeling is not very sophisticated: it only identifies whether a passage contains an application to organic chemistry.
  • The user is unsure whether they need to train models to detect and label these segments or whether they can feed the data into an LLM with no prior training.
  • The user is seeking recommendations for the best and cheapest services or libraries that can assist with this workflow without having to create a new system from scratch.
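
The workflow described above can be sketched as a small Python pipeline: split the extracted paper text into passages, then apply a yes/no classifier to each one. This is only an illustration under stated assumptions, not the author's method. The function names, the blank-line passage splitting, and the keyword list are all hypothetical, and the keyword matcher is a crude stand-in for whatever classifier is eventually chosen (for example, a zero-shot prompt to an LLM or a trained model).

```python
import re
from typing import Callable, List, Tuple

# Hypothetical keyword list: a crude placeholder for a real classifier.
ORGANIC_CHEMISTRY_TERMS = (
    "organic synthesis",
    "catalyst",
    "reagent",
    "enantioselective",
)


def split_passages(text: str) -> List[str]:
    """Split already-extracted text into passages on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def keyword_classifier(passage: str) -> bool:
    """Stand-in classifier: True if any placeholder term appears."""
    lowered = passage.lower()
    return any(term in lowered for term in ORGANIC_CHEMISTRY_TERMS)


def label_passages(
    text: str,
    classify: Callable[[str], bool] = keyword_classifier,
) -> List[Tuple[str, bool]]:
    """Pair each passage with a binary label from the given classifier.

    `classify` is pluggable: an LLM-backed function with the same
    signature could be passed in without changing the pipeline.
    """
    return [(passage, classify(passage)) for passage in split_passages(text)]


sample = (
    "We applied the method to the enantioselective synthesis of an alcohol.\n"
    "\n"
    "The dataset contains 500 images of galaxies."
)
labels = [label for _, label in label_passages(sample)]
```

Keeping the classifier as a plain callable means the question of "train a model or use an LLM with no prior training" only affects one function, so the two options can be compared on the same passages.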