This chapter of the book is a little thin, but the author explains that the book is an introductory, practice-oriented NLP text and that syntactic analysis is a more advanced NLP problem, so it is not covered in depth. I am also an NLP beginner working through this book. After finishing it, I will learn statistical natural lang ...
Posted by forcerecon on Tue, 24 May 2022 12:42:37 +0300
The original code comes from GitHub: https://github.com/OustandingMan/LSTM-CRF
However, the code that reads the corpus does not come from that template; it is code I wrote myself to read the data.
My understanding of deep learning:
Data processing: convert the data into a format that the network can read
Network construction: vari ...
Posted by FraggleRock on Mon, 16 May 2022 16:11:02 +0300
Official Account: Notes on Data Mining and Machine Learning
Sentiment classification with a CNN-LSTM; the model here is a binary classifier. The overall process is divided into the following steps:
Environment and parameter settings
Model network structure construction and training
1. Environment and parameter setti ...
Posted by hkothari on Mon, 16 May 2022 12:42:47 +0300
1. Implementation steps of text classification:
Definition stage: define the data and the classification scheme, i.e. which categories to use and which data are needed
Data preprocessing: prepare the documents with word segmentation and stop-word removal
Feature extraction: reduce the dimension of the document matrix and extract the most useful featu ...
Posted by lucerias on Mon, 16 May 2022 00:57:33 +0300
Note: "batch" here means mini-batch
Two ways to batch sequences (text, logs)
Fixed-length batches (uniform-length batches): all sequences in every batch have the same length. For example:
seqs = [[1,2,3,3,4,5,6,7], [1,2,3], [2,4,1,2,3], [1,2,4,1]]
batch_size = 2
The maximum sequence length is then 8. If a sequence is shorter than 8, fill it ...
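The fixed-length scheme described above can be sketched in plain Python. This is only an illustration, not the post's own code: the helper name pad_batches is hypothetical, and the pad value 0 is an assumption, since the excerpt is cut off before it says what the short sequences are filled with.

```python
def pad_batches(seqs, batch_size, pad_value=0):
    """Pad every sequence to the global maximum length, then split into mini-batches."""
    # The longest sequence in the whole dataset sets the length for all batches.
    max_len = max(len(s) for s in seqs)
    # Right-pad shorter sequences; pad_value=0 is an assumed filler token.
    padded = [s + [pad_value] * (max_len - len(s)) for s in seqs]
    # Cut the padded list into consecutive chunks of batch_size sequences.
    return [padded[i:i + batch_size] for i in range(0, len(padded), batch_size)]


seqs = [[1, 2, 3, 3, 4, 5, 6, 7], [1, 2, 3], [2, 4, 1, 2, 3], [1, 2, 4, 1]]
batches = pad_batches(seqs, batch_size=2)
for batch in batches:
    print(batch)
```

With the example data this gives two batches of two sequences each, and every sequence has length 8. The cost of this approach is wasted computation on padding whenever one long sequence forces all the others to be padded up to its length.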
Posted by sonic_2k_uk on Sat, 14 May 2022 05:21:32 +0300
TF-IDF model: analysis of epidemic text data based on jieba word segmentation and wordcloud
Recently, we carried out a text data analysis of China's COVID-19 policies. This post introduces the relevant background to summarize and consolidate what we learned, in the hope that it helps more people.
1. TF-IDF: keyword extraction
Stop words: stop words are words o ...
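As a rough sketch of how TF-IDF keyword extraction with stop-word filtering fits together, here is a dependency-free version. It is not the post's code: the function name tfidf_keywords, the toy documents, and the stop-word set are all hypothetical, the input is assumed to be tokenized already, and the unsmoothed weighting tf * log(N / df) is just one common variant (library implementations may weight terms differently).

```python
import math
from collections import Counter


def tfidf_keywords(docs, stop_words=frozenset(), top_k=3):
    """Return the top_k highest-scoring terms for each tokenized document."""
    # Drop stop words before any counting.
    filtered = [[w for w in doc if w not in stop_words] for doc in docs]
    n_docs = len(filtered)
    # Document frequency: in how many documents does each term appear?
    df = Counter(w for doc in filtered for w in set(doc))
    results = []
    for doc in filtered:
        tf = Counter(doc)
        # Term frequency times inverse document frequency (unsmoothed variant).
        scores = {
            w: (count / len(doc)) * math.log(n_docs / df[w])
            for w, count in tf.items()
        }
        # Sort by score, highest first; break ties alphabetically.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append([w for w, _ in ranked[:top_k]])
    return results


# Hypothetical, already-tokenized toy documents.
docs = [
    ["the", "vaccine", "policy", "vaccine"],
    ["the", "travel", "policy"],
    ["the", "mask", "rule"],
]
keywords = tfidf_keywords(docs, stop_words={"the"}, top_k=2)
print(keywords)
```

A term scores highly when it is frequent in one document but rare across the collection, which is why "vaccine" outranks "policy" in the first toy document.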
Posted by Dasndan on Fri, 13 May 2022 00:46:36 +0300
[Learning frameworks from official examples: Keras] Character-level LSTM seq2seq
Keras official example link, TensorFlow official example link, Paddle official example link, PyTorch official example link
Note: this series is only meant to help you quickly understand and learn each framework so that you can use it independently for deep learning research. Please ...
Posted by sarah on Sun, 08 May 2022 07:29:41 +0300
Overview of this article: some study notes from reproducing the BERT-NER-pytorch project in the knowledge graph (KG) open source project collection. They may be a useful reference for fellow beginners.
References: for an introduction to the Transformer in the BERT model, a must-share is Jay Alammar's animated walkthrough; why didn't I see such a good ...
Posted by MartiniMan on Sun, 08 May 2022 05:09:16 +0300
Recently I have been using an LSTM for text classification on the THUCNews dataset. The earlier 10-category news classifier converged normally with the LSTM model, which suggests the problem is not a bug in the code. However, when I expanded to 14 news categories, the loss stopped decreasing: Because I don't know ...
Posted by narch31 on Tue, 03 May 2022 07:45:01 +0300
Putting the contents of the previous sections together: multi-head attention and positional encoding
import pandas as pd
from torch import nn
from d2l import torch as d2l
Position-wise feed-forward network
The name sounds fancy, but it is really just an MLP with a single hidden layer
class PositionWi ...
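To illustrate the "single hidden layer MLP" remark independently of the truncated class above, here is a minimal sketch in plain Python rather than PyTorch, so it runs without dependencies. The function names and toy weights are hypothetical and are not the post's implementation. The point it shows is that the same two linear layers, with a ReLU in between, are applied to each position of the sequence separately.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(v, weight, bias):
    """Compute weight @ v + bias for one vector; weight is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weight, bias)]


def position_wise_ffn(seq, w1, b1, w2, b2):
    """Apply the same one-hidden-layer MLP to every position of the sequence."""
    return [linear(relu(linear(x, w1, b1)), w2, b2) for x in seq]


# Hypothetical toy weights: input dim 2 -> hidden dim 3 -> output dim 2.
w1 = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b2 = [0.0, 0.0]

# Positions 0 and 2 hold the same vector, so their outputs must match.
seq = [[1.0, 2.0], [-1.0, -2.0], [1.0, 2.0]]
out = position_wise_ffn(seq, w1, b1, w2, b2)
print(out)
```

Because the weights are shared and no information flows between positions, identical input vectors give identical outputs regardless of where they sit in the sequence.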
Posted by natbrazil on Mon, 25 Apr 2022 20:16:13 +0300