nlptagger

command module
v0.0.0-...-fcbc575 Latest
Published: Dec 22, 2024 License: GPL-2.0, GPL-3.0 Imports: 12 Imported by: 0

README

nlptagger

For when you need a program to understand the context of commands.


General info

This project tags CLI commands. It is not an LLM and is not trying to be one. I use it to generate Go code, but I made it a completely separate project so others can enjoy it. I will keep working on it, and I hope to improve the phrase tagging and add neural networks in the future.

Background

  1. Tokenization: This is the very first step in most NLP pipelines. It involves breaking down text into individual units called tokens (words, punctuation marks, etc.). Tokenization is fundamental because it creates the building blocks for further analysis.

  2. Part-of-Speech (POS) Tagging: POS tagging assigns grammatical categories (noun, verb, adjective, etc.) to each token. It's a crucial step for understanding sentence structure and is often used as input for more complex tasks like phrase tagging.

  3. Named Entity Recognition (NER): NER identifies and classifies named entities (people, organizations, locations, dates, etc.) in text. This is more specific than POS tagging but still more generic than phrase tagging, as it focuses on individual entities rather than complete phrases.

  4. Dependency Parsing: Dependency parsing analyzes the grammatical relationships between words in a sentence, creating a tree-like structure that shows how words depend on each other. It provides a deeper understanding of sentence structure than phrase tagging, which focuses on contiguous chunks.

  5. Lemmatization and Stemming: These techniques reduce words to their base or root forms (e.g., "running" to "run"). They help to normalize text and improve the accuracy of other NLP tasks.
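The first and last steps in the list above can be sketched in a few lines of Go. This is a minimal illustration, not nlptagger's implementation: the function names `tokenize` and `stem` and the three-suffix rule set are assumptions made for this example, and a real stemmer (such as Porter's) handles far more cases.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize lowercases the input and splits it into runs of letters and digits.
// Punctuation and whitespace act as separators and are dropped.
// (Hypothetical helper, not part of nlptagger.)
func tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// stem strips one of a few common English suffixes from a token.
// It is a toy rule set for illustration, not a complete stemmer.
func stem(token string) string {
	for _, suffix := range []string{"ing", "ed", "s"} {
		// Require at least three remaining characters so short words such as "red" survive.
		if !strings.HasSuffix(token, suffix) || len(token) < len(suffix)+3 {
			continue
		}
		base := strings.TrimSuffix(token, suffix)
		// Undo consonant doubling ("runn" becomes "run") after "ing" and "ed".
		n := len(base)
		if suffix != "s" && base[n-1] == base[n-2] &&
			!strings.ContainsRune("aeioulsz", rune(base[n-1])) {
			base = base[:n-1]
		}
		return base
	}
	return token
}

func main() {
	sentence := "Generating a webserver with handlers, running now!"
	for _, tok := range tokenize(sentence) {
		fmt.Printf("%-12s -> %s\n", tok, stem(tok))
	}
}
```

With these rules "running" reduces to "run" and "handlers" to "handler". Words the rules do not anticipate (for example "this") are stemmed incorrectly, which is why production pipelines rely on a fuller algorithm or a lemmatizer.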

Phrase tagging often uses the output of these more generic techniques as input. For example:

  • POS tags are commonly used to define rules for identifying phrases (e.g., a noun phrase might be defined as a sequence of words starting with a determiner followed by one or more adjectives and a noun).
  • NER can be used to identify specific types of phrases (e.g., a phrase tagged as "PERSON" might indicate a person's name).
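The determiner/adjective/noun rule described above can be sketched as a small scan over POS tags. This is an illustration only, not nlptagger's phrase tagger: the `Span` type, the `nounPhrases` function, and the tag labels `DET`, `ADJ`, and `NOUN` are assumptions, and the sketch relaxes the rule to allow zero or more adjectives so that phrases like "the handler" also match.

```go
package main

import "fmt"

// Span marks a half-open token range [Start, End) that forms a noun phrase.
// (Hypothetical type for this example.)
type Span struct{ Start, End int }

// nounPhrases scans POS tags left to right and returns every span that
// matches the pattern DET ADJ* NOUN. The tag labels are illustrative.
func nounPhrases(tags []string) []Span {
	var spans []Span
	for i := 0; i < len(tags); i++ {
		if tags[i] != "DET" {
			continue
		}
		// Skip over any adjectives that follow the determiner.
		j := i + 1
		for j < len(tags) && tags[j] == "ADJ" {
			j++
		}
		// The phrase is complete only if a noun comes next.
		if j < len(tags) && tags[j] == "NOUN" {
			spans = append(spans, Span{Start: i, End: j + 1})
			i = j // resume scanning after the matched phrase
		}
	}
	return spans
}

func main() {
	tokens := []string{"generate", "a", "small", "webserver", "with", "the", "handler"}
	tags := []string{"VERB", "DET", "ADJ", "NOUN", "ADP", "DET", "NOUN"}
	for _, s := range nounPhrases(tags) {
		fmt.Println(tokens[s.Start:s.End])
	}
}
```

For the hand-tagged sentence in `main`, the scan finds two phrases: "a small webserver" and "the handler".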

Why build this?

  • Go is stable and rarely makes breaking changes
  • It is nice to not have terminal drop downs

What does it do?

  • It tags the words in CLI commands with POS, NER, phrase, and dependency-relation tags. I made an overview video on this project.

Technologies

Just Go.

Requirements

  • Go 1.23 (for gonew)

How to run as is?

package main

import (
	"fmt"
	"strings"

	modeldata "github.com/golangast/nlptagger/nn"
	"github.com/golangast/nlptagger/tagger/tag"
)

func main() {

	// You have to create a training file first
	md, err := modeldata.ModelData("data/training_data.json")
	if err != nil {
		fmt.Println("Error loading or training model:", err)
		return // md is not usable if loading or training failed
	}
	// Example sentence to tag
	sentence := "generate a webserver with the handler dog with the data structure people"
	// Predict the POS, NER, phrase, and dependency-relation tags
	predictedPosTags, predictedNerTags, predictedPhraseTags, predictedDRTags := md.PredictTags(sentence)

	// Collect the predictions in a tag.Tag struct
	predictedTagStruct := tag.Tag{
		PosTag:          predictedPosTags, // Assign the predicted POS tags to the PosTag field
		NerTag:          predictedNerTags,
		PhraseTag:       predictedPhraseTags,
		DepRelationsTag: predictedDRTags,
	}

	// Print the sentence again for clarity
	fmt.Println("Sentence:", sentence)
	// Print the predicted POS tags in a space-separated format
	fmt.Println("Predicted POS Tag Types:", strings.Join(predictedTagStruct.PosTag, " "))
	fmt.Println("Predicted NER Tag Types:", strings.Join(predictedTagStruct.NerTag, " "))
	fmt.Println("Predicted Phrase Tag Types:", strings.Join(predictedTagStruct.PhraseTag, " "))
	fmt.Println("Predicted Dependency Relation Tag Types:", strings.Join(predictedTagStruct.DepRelationsTag, " "))

}

    • clone it
git clone https://github.com/golangast/nlptagger
    • or install gonew to pull down the project quickly
go install golang.org/x/tools/cmd/gonew@latest
    • run gonew
gonew github.com/golangast/nlptagger example.com/nlptagger
    • cd into nlptagger
cd nlptagger
    • run the project
go run main.go

Repository overview

├── data #training data
│   └── training_data.json
├── nn #neural network
│   ├── modeldata.go
│   ├── nnu #neural network utils
│   └── simplenn #simple neural network
├── tagger #tagger folder
│   ├── dependencyrelation #dependency relation
│   ├── nertagger #ner tagging
│   ├── phrasetagger #phrase tagging
│   ├── postagger #pos tagging
│   ├── stem #stemming tokens before tagging
│   ├── tag #tag data structure
│   └── tagger.go
└── trained_model.gob #model

Overview of the code.

All this does is tag sentences. It is not an LLM, only an attempt at tagging commands.

Things to remember

  • It is not an LLM or trying to be one
  • It is only for CLI commands

Just added

*the project

Special thanks

Why Go?

Documentation


There is no documentation for this package.
