Revert "permet l'ajout des frameworks et des routes" ("allows adding frameworks and routes")

This reverts commit 361112699c
Dario Duchateau-weinberger
2023-09-25 09:44:12 +02:00
parent 361112699c
commit 20cb812095
2787 changed files with 0 additions and 864804 deletions


@@ -1,463 +0,0 @@
<p align="center">
<img src="./logo.png" alt="Hyntax project logo — lego bricks in the shape of a capital letter H" width="150">
</p>
# Hyntax
Straightforward HTML parser for JavaScript. [Live Demo](https://astexplorer.net/#/gist/6bf7f78077333cff124e619aebfb5b42/latest).
- **Simple.** API is straightforward, output is clear.
- **Forgiving.** Like a browser, it gracefully parses invalid HTML.
- **Supports streaming.** Can process HTML while it's still being loaded.
- **No dependencies.**
## Table Of Contents
- [Usage](#usage)
- [TypeScript Typings](#typescript-typings)
- [Streaming](#streaming)
- [Tokens](#tokens)
- [AST Format](#ast-format)
- [API Reference](#api-reference)
- [Types Reference](#types-reference)
## Usage
```bash
npm install hyntax
```
```javascript
const { tokenize, constructTree } = require('hyntax')
const util = require('util')
const inputHTML = `
<html>
<body>
<input type="text" placeholder="Don't type">
<button>Don't press</button>
</body>
</html>
`
const { tokens } = tokenize(inputHTML)
const { ast } = constructTree(tokens)
console.log(JSON.stringify(tokens, null, 2))
console.log(util.inspect(ast, { showHidden: false, depth: null }))
```
## TypeScript Typings
Hyntax is written in JavaScript but ships with [integrated TypeScript typings](./index.d.ts) to help you navigate its data structures. There is also a [Types Reference](#types-reference) covering the most common types.
## Streaming
Use the `StreamTokenizer` and `StreamTreeConstructor` classes to parse HTML chunk by chunk while it is still being loaded from the network or read from disk.
```javascript
const { StreamTokenizer, StreamTreeConstructor } = require('hyntax')
const http = require('http')
const util = require('util')
http.get('http://info.cern.ch', (res) => {
const streamTokenizer = new StreamTokenizer()
const streamTreeConstructor = new StreamTreeConstructor()
let resultTokens = []
let resultAst
res.pipe(streamTokenizer).pipe(streamTreeConstructor)
streamTokenizer
.on('data', (tokens) => {
resultTokens = resultTokens.concat(tokens)
})
.on('end', () => {
console.log(JSON.stringify(resultTokens, null, 2))
})
streamTreeConstructor
.on('data', (ast) => {
resultAst = ast
})
.on('end', () => {
console.log(util.inspect(resultAst, { showHidden: false, depth: null }))
})
}).on('error', (err) => {
throw err;
})
```
## Tokens
Here are all the kinds of tokens Hyntax extracts from an HTML string.
![Overview of all possible tokens](./tokens-list.png)
Each token conforms to the [Tokenizer.Token](#TokenizerToken) interface.
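As an illustrative sketch, here is a hand-written token for the start of `<div>`, shaped according to that interface (the content and positions are illustrative, not actual Hyntax output):

```javascript
// Hand-written sample token shaped per the Tokenizer.Token interface.
// The exact content and positions Hyntax emits may differ.
const sampleToken = {
  type: 'token:open-tag-start', // one of the documented token type strings
  content: '<div',
  startPosition: 0,
  endPosition: 3
}

// startPosition/endPosition index into the original HTML string,
// so the token's content is a slice of the input:
const html = '<div>'
const slice = html.slice(sampleToken.startPosition, sampleToken.endPosition + 1)
console.log(slice) // '<div'
```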
## AST Format
The resulting syntax tree will have at least one top-level [Document Node](#ast-node-types), with optional child nodes nested within.
<!-- You can play around with the [AST Explorer](https://astexplorer.net) to see how AST looks like. -->
```javascript
{
nodeType: TreeConstructor.NodeTypes.Document,
content: {
children: [
{
nodeType: TreeConstructor.NodeTypes.AnyNodeType,
content: {}
},
{
nodeType: TreeConstructor.NodeTypes.AnyNodeType,
content: {}
}
]
}
}
```
The content of each node is specific to its type; all node types are described in the [AST Node Types](#ast-node-types) reference.
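The documented shape is already enough to traverse. As a minimal sketch, a depth-first walk that assumes only `nodeType` and an optional `content.children` array (node type strings match `TreeConstructor.NodeTypes` in the bundled typings; the AST fragment is hand-written for illustration):

```javascript
// Minimal sketch: depth-first traversal over nodes with the documented
// { nodeType, content } shape.
function walk (node, visit) {
  visit(node)
  for (const child of node.content.children || []) {
    walk(child, visit)
  }
}

// Hand-written AST fragment, not actual Hyntax output:
const ast = {
  nodeType: 'document',
  content: {
    children: [
      { nodeType: 'text', content: {} },
      {
        nodeType: 'tag',
        content: { children: [{ nodeType: 'text', content: {} }] }
      }
    ]
  }
}

const visited = []
walk(ast, (node) => visited.push(node.nodeType))
console.log(visited) // [ 'document', 'text', 'tag', 'text' ]
```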
## API Reference
### Tokenizer
Hyntax exposes its tokenizer as a separate module. You can use the generated tokens on their own or pass them on to the tree constructor to build an AST.
#### Interface
```typescript
tokenize(html: string): Tokenizer.Result
```
#### Arguments
- `html`
The HTML string to process.
Required.
Type: `string`.
#### Returns [Tokenizer.Result](#TokenizerResult)
### Tree Constructor
Once you have an array of tokens, pass them to the tree constructor to build an AST.
#### Interface
```typescript
constructTree(tokens: Tokenizer.AnyToken[]): TreeConstructor.Result
```
#### Arguments
- `tokens`
Array of tokens received from the tokenizer.
Required.
Type: [Tokenizer.AnyToken[]](#tokenizeranytoken)
#### Returns [TreeConstructor.Result](#TreeConstructorResult)
## Types Reference
#### Tokenizer.Result
```typescript
interface Result {
state: Tokenizer.State
tokens: Tokenizer.AnyToken[]
}
```
- `state`
The current state of the tokenizer. It can be persisted and passed to the next `tokenize` call when the input arrives in chunks.
- `tokens`
Array of resulting tokens.
Type: [Tokenizer.AnyToken[]](#tokenizeranytoken)
#### TreeConstructor.Result
```typescript
interface Result {
state: State
ast: AST
}
```
- `state`
The current state of the tree constructor. It can be persisted and passed to the next `constructTree` call when tokens arrive in chunks.
- `ast`
Resulting AST.
Type: [TreeConstructor.AST](#treeconstructorast)
#### Tokenizer.Token
A generic token interface; specific token types are created from it.
```typescript
interface Token<T extends TokenTypes.AnyTokenType> {
type: T
content: string
startPosition: number
endPosition: number
}
```
- `type`
One of the [Token types](#TokenizerTokenTypesAnyTokenType).
- `content`
Piece of original HTML string which was recognized as a token.
- `startPosition`
Index of a character in the input HTML string where the token starts.
- `endPosition`
Index of a character in the input HTML string where the token ends.
#### Tokenizer.TokenTypes.AnyTokenType
Shortcut type of all possible tokens.
```typescript
type AnyTokenType =
| Text
| OpenTagStart
| AttributeKey
| AttributeAssigment
| AttributeValueWrapperStart
| AttributeValue
| AttributeValueWrapperEnd
| OpenTagEnd
| CloseTag
| OpenTagStartScript
| ScriptTagContent
| OpenTagEndScript
| CloseTagScript
| OpenTagStartStyle
| StyleTagContent
| OpenTagEndStyle
| CloseTagStyle
| DoctypeStart
| DoctypeEnd
| DoctypeAttributeWrapperStart
| DoctypeAttribute
| DoctypeAttributeWrapperEnd
| CommentStart
| CommentContent
| CommentEnd
```
#### Tokenizer.AnyToken
Shortcut to reference any possible token.
```typescript
type AnyToken = Token<TokenTypes.AnyTokenType>
```
#### TreeConstructor.AST
An alias for DocumentNode. The AST always has exactly one top-level DocumentNode. See [AST Node Types](#ast-node-types).
```typescript
type AST = TreeConstructor.DocumentNode
```
### AST Node Types
There are 7 possible node types, each with its own content shape.
```typescript
type DocumentNode = Node<NodeTypes.Document, NodeContents.Document>
type DoctypeNode = Node<NodeTypes.Doctype, NodeContents.Doctype>
type TextNode = Node<NodeTypes.Text, NodeContents.Text>
type TagNode = Node<NodeTypes.Tag, NodeContents.Tag>
type CommentNode = Node<NodeTypes.Comment, NodeContents.Comment>
type ScriptNode = Node<NodeTypes.Script, NodeContents.Script>
type StyleNode = Node<NodeTypes.Style, NodeContents.Style>
```
Interfaces for each content type:
- [Document](#TreeConstructorNodeContentsDocument)
- [Doctype](#TreeConstructorNodeContentsDoctype)
- [Text](#TreeConstructorNodeContentsText)
- [Tag](#TreeConstructorNodeContentsTag)
- [Comment](#TreeConstructorNodeContentsComment)
- [Script](#TreeConstructorNodeContentsScript)
- [Style](#TreeConstructorNodeContentsStyle)
#### TreeConstructor.Node
A generic node; other interfaces use it to create specific nodes by providing the node type and the type of the node's content.
```typescript
interface Node<T extends NodeTypes.AnyNodeType, C extends NodeContents.AnyNodeContent> {
nodeType: T
content: C
}
```
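For illustration, here is how such a generic builds a concrete node type. This is a simplified stand-in (local `Node` and `TextContent` definitions), not the actual bundled typings:

```typescript
// Simplified stand-in showing how a Node<T, C> generic is instantiated.
// 'text' and TextContent mirror NodeTypes.Text / NodeContents.Text.
interface Node<T extends string, C> {
  nodeType: T
  content: C
}

interface TextContent {
  value: { type: string; content: string }
}

type TextNode = Node<'text', TextContent>

const node: TextNode = {
  nodeType: 'text',
  content: { value: { type: 'token:text', content: 'Hello' } }
}

console.log(node.nodeType, node.content.value.content) // text Hello
```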
#### TreeConstructor.NodeTypes.AnyNodeType
Shortcut type of all possible Node types.
```typescript
type AnyNodeType =
| Document
| Doctype
| Tag
| Text
| Comment
| Script
| Style
```
### Node Content Types
#### TreeConstructor.NodeContents.AnyNodeContent
Shortcut type of all possible types of content inside a Node.
```typescript
type AnyNodeContent =
| Document
| Doctype
| Text
| Tag
| Comment
| Script
| Style
```
#### TreeConstructor.NodeContents.Document
```typescript
interface Document {
children: AnyNode[]
}
```
#### TreeConstructor.NodeContents.Doctype
```typescript
interface Doctype {
start: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeStart>
attributes?: DoctypeAttribute[]
end: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeEnd>
}
```
#### TreeConstructor.NodeContents.Text
```typescript
interface Text {
value: Tokenizer.Token<Tokenizer.TokenTypes.Text>
}
```
#### TreeConstructor.NodeContents.Tag
```typescript
interface Tag {
name: string
selfClosing: boolean
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStart>
attributes?: TagAttribute[]
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEnd>
children?: AnyNode[]
close?: Tokenizer.Token<Tokenizer.TokenTypes.CloseTag>
}
```
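For example, a tag node for `<button>Don't press</button>` would carry content roughly like this (a hand-written sketch; token positions index into that string and are illustrative, not actual Hyntax output):

```javascript
// Hand-written sketch of a Tag node for: <button>Don't press</button>
const buttonNode = {
  nodeType: 'tag',
  content: {
    name: 'button',
    selfClosing: false,
    openStart: { type: 'token:open-tag-start', content: '<button', startPosition: 0, endPosition: 6 },
    openEnd: { type: 'token:open-tag-end', content: '>', startPosition: 7, endPosition: 7 },
    children: [
      {
        nodeType: 'text',
        content: {
          value: { type: 'token:text', content: "Don't press", startPosition: 8, endPosition: 18 }
        }
      }
    ],
    close: { type: 'token:close-tag', content: '</button>', startPosition: 19, endPosition: 27 }
  }
}

console.log(buttonNode.content.name)                              // button
console.log(buttonNode.content.children[0].content.value.content) // Don't press
```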
#### TreeConstructor.NodeContents.Comment
```typescript
interface Comment {
start: Tokenizer.Token<Tokenizer.TokenTypes.CommentStart>
value: Tokenizer.Token<Tokenizer.TokenTypes.CommentContent>
end: Tokenizer.Token<Tokenizer.TokenTypes.CommentEnd>
}
```
#### TreeConstructor.NodeContents.Script
```typescript
interface Script {
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartScript>
attributes?: TagAttribute[]
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndScript>
value: Tokenizer.Token<Tokenizer.TokenTypes.ScriptTagContent>
close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagScript>
}
```
#### TreeConstructor.NodeContents.Style
```typescript
interface Style {
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartStyle>,
attributes?: TagAttribute[],
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndStyle>,
value: Tokenizer.Token<Tokenizer.TokenTypes.StyleTagContent>,
close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagStyle>
}
```
#### TreeConstructor.DoctypeAttribute
```typescript
interface DoctypeAttribute {
startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperStart>,
value: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttribute>,
endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperEnd>
}
```
#### TreeConstructor.TagAttribute
```typescript
interface TagAttribute {
key?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeKey>,
startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperStart>,
value?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValue>,
endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperEnd>
}
```
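Putting it together: a sketch of the `TagAttribute` for `type="text"` as it appears in the usage example's `<input type="text" ...>` tag (hand-written tokens; positions are illustrative):

```javascript
// Hand-written TagAttribute for `type="text"` in `<input type="text" ...>`.
// Positions index into that string and illustrate the documented shape.
const attribute = {
  key: { type: 'token:attribute-key', content: 'type', startPosition: 7, endPosition: 10 },
  startWrapper: { type: 'token:attribute-value-wrapper-start', content: '"', startPosition: 12, endPosition: 12 },
  value: { type: 'token:attribute-value', content: 'text', startPosition: 13, endPosition: 16 },
  endWrapper: { type: 'token:attribute-value-wrapper-end', content: '"', startPosition: 17, endPosition: 17 }
}

// Reassemble the attribute's source text from its tokens:
const source =
  attribute.key.content + '=' +
  attribute.startWrapper.content +
  attribute.value.content +
  attribute.endWrapper.content

console.log(source) // type="text"
```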


@@ -1,298 +0,0 @@
import { Transform } from 'stream'
declare function tokenize(
html: string,
existingState?: Tokenizer.State,
options?: Tokenizer.Options
): Tokenizer.Result
declare function constructTree(
tokens: Tokenizer.AnyToken[],
existingState?: TreeConstructor.State
): TreeConstructor.Result
declare class StreamTokenizer extends Transform {}
declare class StreamTreeConstructor extends Transform {}
export namespace Tokenizer {
namespace ContextTypes {
type Data = 'tokenizer-context:data'
type OpenTagStart = 'tokenizer-context:open-tag-start'
type CloseTag = 'tokenizer-context:close-tag'
type Attributes = 'tokenizer-context:attributes'
type OpenTagEnd = 'tokenizer-context:open-tag-end'
type AttributeKey = 'tokenizer-context:attribute-key'
type AttributeValue = 'tokenizer-context:attribute-value'
type AttributeValueBare = 'tokenizer-context:attribute-value-bare'
type AttributeValueWrapped = 'tokenizer-context:attribute-value-wrapped'
type ScriptContent = 'tokenizer-context:script-content'
type StyleContent = 'tokenizer-context:style-content'
type DoctypeStart = 'tokenizer-context:doctype-start'
type DoctypeEnd = 'tokenizer-context:doctype-end'
type DoctypeAttributes = 'tokenizer-context:doctype-attributes'
type DoctypeAttributeWrapped = 'tokenizer-context:doctype-attribute-wrapped'
type DoctypeAttributeBare = 'tokenizer-context:doctype-attribute-bare'
type CommentStart = 'tokenizer-context:comment-start'
type CommentContent = 'tokenizer-context:comment-content'
type CommentEnd = 'tokenizer-context:comment-end'
type AnyContextType =
| Data
| OpenTagStart
| CloseTag
| Attributes
| OpenTagEnd
| AttributeKey
| AttributeValue
| AttributeValueBare
| AttributeValueWrapped
| ScriptContent
| StyleContent
| DoctypeStart
| DoctypeEnd
| DoctypeAttributes
| DoctypeAttributeWrapped
| DoctypeAttributeBare
| CommentStart
| CommentContent
| CommentEnd
}
namespace TokenTypes {
type Text = 'token:text'
type OpenTagStart = 'token:open-tag-start'
type AttributeKey = 'token:attribute-key'
type AttributeAssigment = 'token:attribute-assignment'
type AttributeValueWrapperStart = 'token:attribute-value-wrapper-start'
type AttributeValue = 'token:attribute-value'
type AttributeValueWrapperEnd = 'token:attribute-value-wrapper-end'
type OpenTagEnd = 'token:open-tag-end'
type CloseTag = 'token:close-tag'
type OpenTagStartScript = 'token:open-tag-start-script'
type ScriptTagContent = 'token:script-tag-content'
type OpenTagEndScript = 'token:open-tag-end-script'
type CloseTagScript = 'token:close-tag-script'
type OpenTagStartStyle = 'token:open-tag-start-style'
type StyleTagContent = 'token:style-tag-content'
type OpenTagEndStyle = 'token:open-tag-end-style'
type CloseTagStyle = 'token:close-tag-style'
type DoctypeStart = 'token:doctype-start'
type DoctypeEnd = 'token:doctype-end'
type DoctypeAttributeWrapperStart = 'token:doctype-attribute-wrapper-start'
type DoctypeAttribute = 'token:doctype-attribute'
type DoctypeAttributeWrapperEnd = 'token:doctype-attribute-wrapper-end'
type CommentStart = 'token:comment-start'
type CommentContent = 'token:comment-content'
type CommentEnd = 'token:comment-end'
type AnyTokenType =
| Text
| OpenTagStart
| AttributeKey
| AttributeAssigment
| AttributeValueWrapperStart
| AttributeValue
| AttributeValueWrapperEnd
| OpenTagEnd
| CloseTag
| OpenTagStartScript
| ScriptTagContent
| OpenTagEndScript
| CloseTagScript
| OpenTagStartStyle
| StyleTagContent
| OpenTagEndStyle
| CloseTagStyle
| DoctypeStart
| DoctypeEnd
| DoctypeAttributeWrapperStart
| DoctypeAttribute
| DoctypeAttributeWrapperEnd
| CommentStart
| CommentContent
| CommentEnd
}
interface Options {
isFinalChunk: boolean
}
interface State {
currentContext: string
contextParams: ContextParams
decisionBuffer: string
accumulatedContent: string
caretPosition: number
}
interface Result {
state: State
tokens: AnyToken[]
}
type AnyToken = Token<TokenTypes.AnyTokenType>
interface Token<T extends TokenTypes.AnyTokenType> {
type: T
content: string
startPosition: number
endPosition: number
}
type ContextParams = {
[C in ContextTypes.AnyContextType]?: {
wrapper?: '"' | '\'',
tagName?: string
}
}
}
export namespace TreeConstructor {
namespace NodeTypes {
type Document = 'document'
type Doctype = 'doctype'
type Tag = 'tag'
type Text = 'text'
type Comment = 'comment'
type Script = 'script'
type Style = 'style'
type AnyNodeType =
| Document
| Doctype
| Tag
| Text
| Comment
| Script
| Style
}
namespace ContextTypes {
type TagContent = 'tree-constructor-context:tag-content'
type Tag = 'tree-constructor-context:tag'
type TagName = 'tree-constructor-context:tag-name'
type Attributes = 'tree-constructor-context:attributes'
type Attribute = 'tree-constructor-context:attribute'
type AttributeValue = 'tree-constructor-context:attribute-value'
type Comment = 'tree-constructor-context:comment'
type Doctype = 'tree-constructor-context:doctype'
type DoctypeAttributes = 'tree-constructor-context:doctype-attributes'
type DoctypeAttribute = 'tree-constructor-context:doctype-attribute'
type ScriptTag = 'tree-constructor-context:script-tag'
type StyleTag = 'tree-constructor-context:style-tag'
type AnyContextType =
| TagContent
| Tag
| TagName
| Attributes
| Attribute
| AttributeValue
| Comment
| Doctype
| DoctypeAttributes
| DoctypeAttribute
| ScriptTag
| StyleTag
}
namespace NodeContents {
interface Document {
children: AnyNode[]
}
interface Doctype {
start: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeStart>
attributes?: DoctypeAttribute[]
end: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeEnd>
}
interface Text {
value: Tokenizer.Token<Tokenizer.TokenTypes.Text>
}
interface Tag {
name: string
selfClosing: boolean
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStart>
attributes?: TagAttribute[]
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEnd>
children?: AnyNode[]
close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTag>
}
interface Comment {
start: Tokenizer.Token<Tokenizer.TokenTypes.CommentStart>
value: Tokenizer.Token<Tokenizer.TokenTypes.CommentContent>
end: Tokenizer.Token<Tokenizer.TokenTypes.CommentEnd>
}
interface Script {
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartScript>
attributes?: TagAttribute[]
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndScript>
value: Tokenizer.Token<Tokenizer.TokenTypes.ScriptTagContent>
close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagScript>
}
interface Style {
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartStyle>,
attributes?: TagAttribute[],
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndStyle>,
value: Tokenizer.Token<Tokenizer.TokenTypes.StyleTagContent>,
close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagStyle>
}
type AnyNodeContent =
| Document
| Doctype
| Text
| Tag
| Comment
| Script
| Style
}
interface State {
caretPosition: number
currentContext: ContextTypes.AnyContextType
currentNode: NodeTypes.AnyNodeType
rootNode: NodeTypes.Document
}
interface Result {
state: State
ast: AST
}
type AST = DocumentNode
interface Node<T extends NodeTypes.AnyNodeType, C extends NodeContents.AnyNodeContent> {
nodeType: T
content: C
}
type AnyNode = Node<NodeTypes.AnyNodeType, NodeContents.AnyNodeContent>
type DocumentNode = Node<NodeTypes.Document, NodeContents.Document>
type DoctypeNode = Node<NodeTypes.Doctype, NodeContents.Doctype>
type TextNode = Node<NodeTypes.Text, NodeContents.Text>
type TagNode = Node<NodeTypes.Tag, NodeContents.Tag>
type CommentNode = Node<NodeTypes.Comment, NodeContents.Comment>
type ScriptNode = Node<NodeTypes.Script, NodeContents.Script>
type StyleNode = Node<NodeTypes.Style, NodeContents.Style>
interface DoctypeAttribute {
startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperStart>,
value: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttribute>,
endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperEnd>
}
interface TagAttribute {
key?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeKey>,
startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperStart>,
value?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValue>,
endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperEnd>
}
}

| Attribute
| AttributeValue
| Comment
| Doctype
| DoctypeAttributes
| DoctypeAttribute
| ScriptTag
| StyleTag
}
namespace NodeContents {
interface Document {
children: AnyNode[]
}
interface Doctype {
start: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeStart>
attributes?: DoctypeAttribute[]
end: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeEnd>
}
interface Text {
value: Tokenizer.Token<Tokenizer.TokenTypes.Text>
}
interface Tag {
name: string
selfClosing: boolean
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStart>
attributes?: TagAttribute[]
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEnd>
children?: AnyNode[]
close?: Tokenizer.Token<Tokenizer.TokenTypes.CloseTag>
}
interface Comment {
start: Tokenizer.Token<Tokenizer.TokenTypes.CommentStart>
value: Tokenizer.Token<Tokenizer.TokenTypes.CommentContent>
end: Tokenizer.Token<Tokenizer.TokenTypes.CommentEnd>
}
interface Script {
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartScript>
attributes?: TagAttribute[]
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndScript>
value: Tokenizer.Token<Tokenizer.TokenTypes.ScriptTagContent>
close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagScript>
}
interface Style {
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartStyle>,
attributes?: TagAttribute[],
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndStyle>,
value: Tokenizer.Token<Tokenizer.TokenTypes.StyleTagContent>,
close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagStyle>
}
type AnyNodeContent =
| Document
| Doctype
| Text
| Tag
| Comment
| Script
| Style
}
interface State {
caretPosition: number
currentContext: ContextTypes.AnyContextType
currentNode: NodeTypes.AnyNodeType
rootNode: NodeTypes.Document
}
interface Result {
state: State
ast: AST
}
type AST = DocumentNode
interface Node<T extends NodeTypes.AnyNodeType, C extends NodeContents.AnyNodeContent> {
nodeType: T
content: C
}
type AnyNode = Node<NodeTypes.AnyNodeType, NodeContents.AnyNodeContent>
type DocumentNode = Node<NodeTypes.Document, NodeContents.Document>
type DoctypeNode = Node<NodeTypes.Doctype, NodeContents.Doctype>
type TextNode = Node<NodeTypes.Text, NodeContents.Text>
type TagNode = Node<NodeTypes.Tag, NodeContents.Tag>
type CommentNode = Node<NodeTypes.Comment, NodeContents.Comment>
type ScriptNode = Node<NodeTypes.Script, NodeContents.Script>
type StyleNode = Node<NodeTypes.Style, NodeContents.Style>
interface DoctypeAttribute {
startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperStart>,
value: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttribute>,
endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperEnd>
}
interface TagAttribute {
key?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeKey>,
startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperStart>,
value?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValue>,
endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperEnd>
}
}
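The `AnyNode` union above is designed for narrowing on `nodeType`. A minimal sketch of walking such a tree, using simplified local stand-ins for the `Node` shapes (the real `TreeConstructor.AnyNode` union has more members, and its contents carry full `Tokenizer.Token` objects rather than bare strings):

```typescript
// Simplified stand-ins for TreeConstructor.Node / AnyNode; the real
// Hyntax types wrap token objects with positions, not plain strings.
interface Node<T extends string, C> {
  nodeType: T
  content: C
}

type TextNode = Node<'text', { value: string }>
type TagNode = Node<'tag', { name: string; children?: AnyNode[] }>
type AnyNode = TextNode | TagNode

// Recursively concatenate the text content of a subtree.
// The nodeType check narrows the union, so content is typed per branch.
function collectText(node: AnyNode): string {
  if (node.nodeType === 'text') {
    return node.content.value
  }
  return (node.content.children ?? []).map(collectText).join('')
}

const tree: AnyNode = {
  nodeType: 'tag',
  content: {
    name: 'button',
    children: [{ nodeType: 'text', content: { value: "Don't press" } }]
  }
}

console.log(collectText(tree)) // → "Don't press"
```

Because each node type pairs a literal `nodeType` with a matching content interface, the compiler enforces that a `text` branch only sees text content, which is the point of the `Node<T, C>` generic above.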

View File

@@ -1,298 +0,0 @@
import { Transform } from 'stream'
declare function tokenize(
html: string,
existingState?: Tokenizer.State,
options?: Tokenizer.Options
): Tokenizer.Result
declare function constructTree(
tokens: Tokenizer.AnyToken[],
existingState?: TreeConstructor.State
): TreeConstructor.Result
declare class StreamTokenizer extends Transform {}
declare class StreamTreeConstructor extends Transform {}
export namespace Tokenizer {
namespace ContextTypes {
type Data = 'tokenizer-context:data'
type OpenTagStart = 'tokenizer-context:open-tag-start'
type CloseTag = 'tokenizer-context:close-tag'
type Attributes = 'tokenizer-context:attributes'
type OpenTagEnd = 'tokenizer-context:open-tag-end'
type AttributeKey = 'tokenizer-context:attribute-key'
type AttributeValue = 'tokenizer-context:attribute-value'
type AttributeValueBare = 'tokenizer-context:attribute-value-bare'
type AttributeValueWrapped = 'tokenizer-context:attribute-value-wrapped'
type ScriptContent = 'tokenizer-context:script-content'
type StyleContent = 'tokenizer-context:style-content'
type DoctypeStart = 'tokenizer-context:doctype-start'
type DoctypeEnd = 'tokenizer-context:doctype-end'
type DoctypeAttributes = 'tokenizer-context:doctype-attributes'
type DoctypeAttributeWrapped = 'tokenizer-context:doctype-attribute-wrapped'
type DoctypeAttributeBare = 'tokenizer-context:doctype-attribute-bare'
type CommentStart = 'tokenizer-context:comment-start'
type CommentContent = 'tokenizer-context:comment-content'
type CommentEnd = 'tokenizer-context:comment-end'
type AnyContextType =
| Data
| OpenTagStart
| CloseTag
| Attributes
| OpenTagEnd
| AttributeKey
| AttributeValue
| AttributeValueBare
| AttributeValueWrapped
| ScriptContent
| StyleContent
| DoctypeStart
| DoctypeEnd
| DoctypeAttributes
| DoctypeAttributeWrapped
| DoctypeAttributeBare
| CommentStart
| CommentContent
| CommentEnd
}
namespace TokenTypes {
type Text = 'token:text'
type OpenTagStart = 'token:open-tag-start'
type AttributeKey = 'token:attribute-key'
type AttributeAssigment = 'token:attribute-assignment'
type AttributeValueWrapperStart = 'token:attribute-value-wrapper-start'
type AttributeValue = 'token:attribute-value'
type AttributeValueWrapperEnd = 'token:attribute-value-wrapper-end'
type OpenTagEnd = 'token:open-tag-end'
type CloseTag = 'token:close-tag'
type OpenTagStartScript = 'token:open-tag-start-script'
type ScriptTagContent = 'token:script-tag-content'
type OpenTagEndScript = 'token:open-tag-end-script'
type CloseTagScript = 'token:close-tag-script'
type OpenTagStartStyle = 'token:open-tag-start-style'
type StyleTagContent = 'token:style-tag-content'
type OpenTagEndStyle = 'token:open-tag-end-style'
type CloseTagStyle = 'token:close-tag-style'
type DoctypeStart = 'token:doctype-start'
type DoctypeEnd = 'token:doctype-end'
type DoctypeAttributeWrapperStart = 'token:doctype-attribute-wrapper-start'
type DoctypeAttribute = 'token:doctype-attribute'
type DoctypeAttributeWrapperEnd = 'token:doctype-attribute-wrapper-end'
type CommentStart = 'token:comment-start'
type CommentContent = 'token:comment-content'
type CommentEnd = 'token:comment-end'
type AnyTokenType =
| Text
| OpenTagStart
| AttributeKey
| AttributeAssigment
| AttributeValueWrapperStart
| AttributeValue
| AttributeValueWrapperEnd
| OpenTagEnd
| CloseTag
| OpenTagStartScript
| ScriptTagContent
| OpenTagEndScript
| CloseTagScript
| OpenTagStartStyle
| StyleTagContent
| OpenTagEndStyle
| CloseTagStyle
| DoctypeStart
| DoctypeEnd
| DoctypeAttributeWrapperStart
| DoctypeAttribute
| DoctypeAttributeWrapperEnd
| CommentStart
| CommentContent
| CommentEnd
}
interface Options {
isFinalChunk: boolean
}
interface State {
currentContext: string
contextParams: ContextParams
decisionBuffer: string
accumulatedContent: string
caretPosition: number
}
interface Result {
state: State
tokens: AnyToken[]
}
type AnyToken = Token<TokenTypes.AnyTokenType>
interface Token<T extends TokenTypes.AnyTokenType> {
type: T
content: string
startPosition: number
endPosition: number
}
type ContextParams = {
[C in ContextTypes.AnyContextType]?: {
wrapper?: '"' | '\'',
tagName?: string
}
}
}
export namespace TreeConstructor {
namespace NodeTypes {
type Document = 'document'
type Doctype = 'doctype'
type Tag = 'tag'
type Text = 'text'
type Comment = 'comment'
type Script = 'script'
type Style = 'style'
type AnyNodeType =
| Document
| Doctype
| Tag
| Text
| Comment
| Script
| Style
}
namespace ContextTypes {
type TagContent = 'tree-constructor-context:tag-content'
type Tag = 'tree-constructor-context:tag'
type TagName = 'tree-constructor-context:tag-name'
type Attributes = 'tree-constructor-context:attributes'
type Attribute = 'tree-constructor-context:attribute'
type AttributeValue = 'tree-constructor-context:attribute-value'
type Comment = 'tree-constructor-context:comment'
type Doctype = 'tree-constructor-context:doctype'
type DoctypeAttributes = 'tree-constructor-context:doctype-attributes'
type DoctypeAttribute = 'tree-constructor-context:doctype-attribute'
type ScriptTag = 'tree-constructor-context:script-tag'
type StyleTag = 'tree-constructor-context:style-tag'
type AnyContextType =
| TagContent
| Tag
| TagName
| Attributes
| Attribute
| AttributeValue
| Comment
| Doctype
| DoctypeAttributes
| DoctypeAttribute
| ScriptTag
| StyleTag
}
namespace NodeContents {
interface Document {
children: AnyNode[]
}
interface Doctype {
start: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeStart>
attributes?: DoctypeAttribute[]
end: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeEnd>
}
interface Text {
value: Tokenizer.Token<Tokenizer.TokenTypes.Text>
}
interface Tag {
name: string
selfClosing: boolean
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStart>
attributes?: TagAttribute[]
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEnd>
children?: AnyNode[]
close?: Tokenizer.Token<Tokenizer.TokenTypes.CloseTag>
}
interface Comment {
start: Tokenizer.Token<Tokenizer.TokenTypes.CommentStart>
value: Tokenizer.Token<Tokenizer.TokenTypes.CommentContent>
end: Tokenizer.Token<Tokenizer.TokenTypes.CommentEnd>
}
interface Script {
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartScript>
attributes?: TagAttribute[]
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndScript>
value: Tokenizer.Token<Tokenizer.TokenTypes.ScriptTagContent>
close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagScript>
}
interface Style {
openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartStyle>,
attributes?: TagAttribute[],
openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndStyle>,
value: Tokenizer.Token<Tokenizer.TokenTypes.StyleTagContent>,
close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagStyle>
}
type AnyNodeContent =
| Document
| Doctype
| Text
| Tag
| Comment
| Script
| Style
}
interface State {
caretPosition: number
currentContext: ContextTypes.AnyContextType
currentNode: NodeTypes.AnyNodeType
rootNode: NodeTypes.Document
}
interface Result {
state: State
ast: AST
}
type AST = DocumentNode
interface Node<T extends NodeTypes.AnyNodeType, C extends NodeContents.AnyNodeContent> {
nodeType: T
content: C
}
type AnyNode = Node<NodeTypes.AnyNodeType, NodeContents.AnyNodeContent>
type DocumentNode = Node<NodeTypes.Document, NodeContents.Document>
type DoctypeNode = Node<NodeTypes.Doctype, NodeContents.Doctype>
type TextNode = Node<NodeTypes.Text, NodeContents.Text>
type TagNode = Node<NodeTypes.Tag, NodeContents.Tag>
type CommentNode = Node<NodeTypes.Comment, NodeContents.Comment>
type ScriptNode = Node<NodeTypes.Script, NodeContents.Script>
type StyleNode = Node<NodeTypes.Style, NodeContents.Style>
interface DoctypeAttribute {
startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperStart>,
value: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttribute>,
endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperEnd>
}
interface TagAttribute {
key?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeKey>,
startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperStart>,
value?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValue>,
endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperEnd>
}
}
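The `ContextParams` mapped type above keys an optional record by every member of the tokenizer's context union, so state for any context can be stored or absent. A reduced sketch of the same pattern, using just two members of the union (names taken from the declarations above):

```typescript
// Two members of the full AnyContextType union, for illustration only.
type AnyContextType =
  | 'tokenizer-context:data'
  | 'tokenizer-context:attribute-value-wrapped'

// Same mapped-type pattern as Tokenizer.ContextParams: every context key
// is optional, and each entry may record a quote wrapper or a tag name.
type ContextParams = {
  [C in AnyContextType]?: {
    wrapper?: '"' | "'"
    tagName?: string
  }
}

// Only the contexts that need state are present.
const params: ContextParams = {
  'tokenizer-context:attribute-value-wrapped': { wrapper: '"' }
}

console.log(params['tokenizer-context:attribute-value-wrapped']?.wrapper) // → "
```

The `?` modifier inside the mapped type is what lets the tokenizer keep a sparse record: contexts without parameters simply have no entry.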

View File

@@ -1,62 +0,0 @@
{
"name": "hyntax",
"version": "1.1.7",
"description": "Straightforward HTML parser for Node.js and browser",
"keywords": [
"html",
"parser",
"html5",
"html5 parser",
"htmlparser",
"html parser",
"html tree-constructor",
"html to JSON",
"html to AST",
"html tokenizer",
"tokenize",
"tokenizer",
"stream parsing",
"stream parser",
"typescript",
"types",
"node.js",
"node.js html parser"
],
"repository": {
"type": "git",
"url": "git@github.com:mykolaharmash/hyntax.git"
},
"homepage": "https://github.com/mykolaharmash/hyntax",
"bugs": "https://github.com/mykolaharmash/hyntax/issues",
"main": "./index.js",
"scripts": {
"test": "tape ${TEST:-'./tests/**/*.test.js'} | tap-spec",
"coverage": "nyc -x 'tests/**/*' npm test",
"generate-readme-toc": "./generate-toc.js",
"prepublishOnly": "babel index.js --out-file index.es5.js && babel lib --out-dir lib-es5"
},
"author": {
"name": "Mykola Harmash",
"email": "mykola.harmash@gmail.com",
"url": "https://stse.io"
},
"license": "MIT",
"private": false,
"engines": {
"node": ">=6.11.1",
"npm": ">=5.3.0"
},
"devDependencies": {
"@babel/cli": "^7.5.5",
"@babel/core": "^7.5.5",
"@babel/preset-env": "^7.5.5",
"coveralls": "^3.0.5",
"deep-diff": "^0.3.8",
"eslint": "^6.3.0",
"nyc": "^15.1.0",
"remark": "^10.0.1",
"remark-toc": "^5.1.1",
"tap-spec": "^5.0.0",
"tape": "^4.11.0"
}
}

View File

@@ -1,62 +0,0 @@
{
"name": "hyntax",
"version": "1.1.7",
"description": "Straightforward HTML parser for Node.js and browser",
"keywords": [
"html",
"parser",
"html5",
"html5 parser",
"htmlparser",
"html parser",
"html tree-constructor",
"html to JSON",
"html to AST",
"html tokenizer",
"tokenize",
"tokenizer",
"stream parsing",
"stream parser",
"typescript",
"types",
"node.js",
"node.js html parser"
],
"repository": {
"type": "git",
"url": "git@github.com:mykolaharmash/hyntax.git"
},
"homepage": "https://github.com/mykolaharmash/hyntax",
"bugs": "https://github.com/mykolaharmash/hyntax/issues",
"main": "./index.js",
"scripts": {
"test": "tape ${TEST:-'./tests/**/*.test.js'} | tap-spec",
"coverage": "nyc -x 'tests/**/*' npm test",
"generate-readme-toc": "./generate-toc.js",
"prepublishOnly": "babel index.js --out-file index.es5.js && babel lib --out-dir lib-es5"
},
"author": {
"name": "Mykola Harmash",
"email": "mykola.harmash@gmail.com",
"url": "https://stse.io"
},
"license": "MIT",
"private": false,
"engines": {
"node": ">=6.11.1",
"npm": ">=5.3.0"
},
"devDependencies": {
"@babel/cli": "^7.11.6",
"@babel/core": "^7.11.6",
"@babel/preset-env": "^7.11.5",
"coveralls": "^3.0.5",
"deep-diff": "^0.3.8",
"eslint": "^6.3.0",
"nyc": "^15.1.0",
"remark": "^10.0.1",
"remark-toc": "^5.1.1",
"tap-spec": "^5.0.0",
"tape": "^4.11.0"
}
}

View File

@@ -1,62 +0,0 @@
{
"name": "hyntax",
"version": "1.1.5",
"description": "Straightforward HTML parser for Node.js and browser",
"keywords": [
"html",
"parser",
"html5",
"html5 parser",
"htmlparser",
"html parser",
"html tree-constructor",
"html to JSON",
"html to AST",
"html tokenizer",
"tokenize",
"tokenizer",
"stream parsing",
"stream parser",
"typescript",
"types",
"node.js",
"node.js html parser"
],
"repository": {
"type": "git",
"url": "git@github.com:nik-garmash/hyntax.git"
},
"homepage": "https://github.com/nik-garmash/hyntax",
"bugs": "https://github.com/nik-garmash/hyntax/issues",
"main": "./index.js",
"scripts": {
"test": "tape ${TEST:-'./tests/**/*.test.js'} | tap-spec",
"coverage": "nyc -x 'tests/**/*' npm test",
"generate-readme-toc": "./generate-toc.js",
"prepublishOnly": "babel index.js --out-file index.es5.js && babel lib --out-dir lib-es5"
},
"author": {
"name": "Nikolay Garmash",
"email": "garmash.nikolay@gmail.com",
"url": "https://nikgarmash.com"
},
"license": "MIT",
"private": false,
"engines": {
"node": ">=6.11.1",
"npm": ">=5.3.0"
},
"devDependencies": {
"@babel/cli": "^7.5.5",
"@babel/core": "^7.5.5",
"@babel/preset-env": "^7.5.5",
"coveralls": "^3.0.5",
"deep-diff": "^0.3.8",
"eslint": "^6.3.0",
"nyc": "^14.1.1",
"remark": "^10.0.1",
"remark-toc": "^5.1.1",
"tap-spec": "^5.0.0",
"tape": "^4.11.0"
}
}

Binary file not shown (before: 25 KiB)

Binary file not shown (before: 17 KiB)

Binary file not shown