numind/NuExtract · Hugging Face

Jun 29, 2024 - huggingface.co
NuMind has developed NuExtract, a structure extraction model: a version of phi-3-mini fine-tuned for information extraction. Users supply a text and a JSON template, and the model fills the template with the matching information from the text. The model is extractive, meaning it only outputs text that is already present in the original input. NuExtract is also available in tiny (0.5B) and large (7B) versions. Other models by NuMind include NuNER Zero, a state-of-the-art zero-shot NER model, as well as multilingual entity recognition and sentiment analysis models.

To use NuExtract, users import the necessary modules and define a function, predict_NuExtract, that prepares the input and processes the output. The model and tokenizer are loaded from the pretrained checkpoint "numind/NuExtract". The input text, the schema, and any examples are then defined and passed to predict_NuExtract, which builds the prompt, generates the output, and returns the extracted information. NuMind recommends running the model in bf16, since the resulting performance loss is negligible.
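
The workflow described above can be sketched roughly as follows. This is a minimal sketch, not the model card's code verbatim: the prompt markers (`<|input|>`, `### Template:`, `### Example:`, `### Text:`, `<|output|>`, `<|end-output|>`), the `max_length` default, and the helper names `build_prompt`, `parse_output`, and `load_model` are assumptions made for illustration and should be checked against the model card before use. Only `predict_NuExtract` and the checkpoint name "numind/NuExtract" come from the summary.

```python
import json


def build_prompt(text, schema, examples=()):
    """Assemble the prompt: template, optional examples, then the text.

    The marker strings here are an assumed prompt format, not confirmed
    by this summary.
    """
    # Re-serialise the template so its indentation is consistent.
    prompt = "<|input|>\n### Template:\n" + json.dumps(json.loads(schema), indent=4) + "\n"
    for example in examples:
        if example:  # skip empty placeholder examples
            prompt += "### Example:\n" + json.dumps(json.loads(example), indent=4) + "\n"
    prompt += "### Text:\n" + text + "\n<|output|>\n"
    return prompt


def parse_output(decoded):
    """Keep only what sits between the (assumed) output markers."""
    return decoded.split("<|output|>")[1].split("<|end-output|>")[0]


def predict_NuExtract(model, tokenizer, text, schema, examples=(), max_length=4000):
    """Prepare the input, generate, and return the extracted JSON string."""
    prompt = build_prompt(text, schema, examples)
    # max_length is an assumed truncation cap, not a documented limit.
    inputs = tokenizer(
        prompt, return_tensors="pt", truncation=True, max_length=max_length
    ).to(model.device)
    output_ids = model.generate(**inputs)[0]
    return parse_output(tokenizer.decode(output_ids, skip_special_tokens=True))


def load_model(name="numind/NuExtract"):
    """Load model and tokenizer in bf16 (requires torch and transformers)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.bfloat16, trust_remote_code=True
    )
    tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
    return model.eval(), tokenizer
```

A call would then look like `model, tokenizer = load_model()` followed by `predict_NuExtract(model, tokenizer, text, schema)`, where `schema` is the JSON template as a string. Keeping prompt construction and output parsing in separate functions makes both easy to check without loading the model.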

Key takeaways:

  • NuExtract is a structure extraction model by NuMind, fine-tuned on a private high-quality synthetic dataset for information extraction.
  • The model is purely extractive, meaning all text output by the model is present as is in the original text.
  • NuMind provides a tiny (0.5B) and a large (7B) version of this model: NuExtract-tiny and NuExtract-large.
  • The model can be used by providing an input text (less than 2000 tokens) and a JSON template describing the information you need to extract.
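
Because the model is described as purely extractive, its output can be sanity-checked by confirming that every extracted string occurs verbatim in the input. The sketch below is one way to do that with the standard library only. The function names `leaf_strings` and `non_extractive_values` are hypothetical, and treating empty strings as "nothing extracted" is an assumption about how unfilled template fields are returned.

```python
import json


def leaf_strings(value):
    """Yield every non-empty string leaf in a nested JSON value."""
    if isinstance(value, str):
        if value:  # assumed: an empty string means the field was left unfilled
            yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from leaf_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from leaf_strings(item)


def non_extractive_values(text, extracted_json):
    """Return extracted strings that do not appear verbatim in text."""
    return [s for s in leaf_strings(json.loads(extracted_json)) if s not in text]
```

An empty list from `non_extractive_values` means every filled field was copied from the input. Anything it returns points to a value the model did not copy as is, which is worth inspecting by hand.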