comment-patterns

A list of comment-patterns for different languages

This module contains an extract of the language-database of groc with information about how single- and multi-line comments are written in different languages.

Basic usage

var commentPattern = require('comment-patterns');
var p = commentPattern('filename.js');

This will lead to p being:

{
  name: "JavaScript",
  nameMatchers: [".js"],
  multiLineComment: [{
    start: /\/\*\*/,
    middle: "*",
    end: "*/",
    apidoc: true
  }, {
    start: /\/\*/,
    middle: "*",
    end: "*/"
  }],
  singleLineComment: [{
    start: "//"
  }]
}

name is the name of the language.
nameMatchers is an array of file extensions that files in this language usually have.
multiLineComment is an array of patterns for comments that may span multiple lines:
- start is the beginning of a comment
- middle is a character or a regex that occurs in front of each comment line
- end marks the end of the comment
singleLineComment is an array of patterns for comments that run until the end of the line; start is the prefix of such a comment.
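
The following is a minimal sketch of how such a pattern object could be used to classify a line of code. The pattern object is copied from the example above rather than loaded from the package, and the helpers startsWithDelimiter and classifyLine are hypothetical names that are not part of this module.

```javascript
// Pattern object copied from the example above (not loaded from the package).
var jsPattern = {
  name: 'JavaScript',
  nameMatchers: ['.js'],
  multiLineComment: [
    { start: /\/\*\*/, middle: '*', end: '*/', apidoc: true },
    { start: /\/\*/, middle: '*', end: '*/' }
  ],
  singleLineComment: [{ start: '//' }]
};

// Hypothetical helper: a delimiter may be a string or a regex,
// so both cases are handled. Only a match at index 0 counts.
function startsWithDelimiter(text, delimiter) {
  if (delimiter instanceof RegExp) {
    var m = delimiter.exec(text);
    return m !== null && m.index === 0;
  }
  return text.indexOf(delimiter) === 0;
}

// Hypothetical helper: reports which kind of comment starts on this line,
// or null if the line does not begin with a comment delimiter.
function classifyLine(pattern, line) {
  var trimmed = line.trim();
  var i;
  for (i = 0; i < pattern.multiLineComment.length; i++) {
    var multi = pattern.multiLineComment[i];
    if (startsWithDelimiter(trimmed, multi.start)) {
      return { kind: 'multiLine', apidoc: multi.apidoc === true };
    }
  }
  for (i = 0; i < pattern.singleLineComment.length; i++) {
    if (startsWithDelimiter(trimmed, pattern.singleLineComment[i].start)) {
      return { kind: 'singleLine' };
    }
  }
  return null;
}

var docLine = classifyLine(jsPattern, '  /** doc');
var plainLine = classifyLine(jsPattern, '/* plain');
var noteLine = classifyLine(jsPattern, '// note');
var codeLine = classifyLine(jsPattern, 'var a = 1; // trailing');
```

The order of the multiLineComment entries matters in this sketch: the apidoc pattern comes first, so a line starting with /** is not classified as a plain /* comment.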

Variation (regex)

It is also possible to retrieve a regular expression that matches comments (up to the next line of code):

var re = commentPattern.regex('filename.js');

The result re will be:

{
  regex: /^([ \t]*)(\/\*\*([\s\S]*?)\*\/|\/\*([\s\S]*?)\*\/|((?:[ \t]*?\/\/.*\r?\n?)+))[\r\n]*/gm,
  cg: {
    indent: 1,
    wholeComment: 2,
    contentStart: 3
  },
  middle: [/^[ \t]*\*/gm, /^[ \t]*\*/gm, /^[ \t]*\/\//gm],
  name: "JavaScript",
  info: [{
    type: "multiline",
    apidoc: true
  }, {
    type: "multiline"
  }, {
    type: "singleline"
  }]
}

regex is the actual regular expression. It matches the comments in a string, including any empty lines after the comment.
cg are constant values referring to capturing groups of the regex.
- match[cg.indent] contains the spaces that indent the comment-start delimiter.
- match[cg.wholeComment] matches the comment including delimiters.
- match[cg.contentStart] is the first group that captures the contents of the comment. In this case, there are multiple possible delimiters, so depending on which delimiter is used, match[cg.contentStart], match[cg.contentStart + 1] or match[cg.contentStart + 2] is filled. The others are undefined.
middle contains one pattern for each group after cg.contentStart that matches the prefix used before comment lines. It can be used to remove this prefix. If the middle-prefix for this capturing group is empty (''), the pattern is null.
info contains additional information for each group after cg.contentStart. Currently this is the type of the comment and apidoc: true if the group matches an apidoc comment.
name is the language name for debugging purposes.
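
As a rough sketch of how these properties fit together, the following example extracts comments from a small JavaScript source string. The re object is copied from the result shown above rather than loaded from the package. The function extractComments and the shape of its return value are assumptions made for illustration only.

```javascript
// Copied from the example result above (not loaded from the package).
var re = {
  regex: /^([ \t]*)(\/\*\*([\s\S]*?)\*\/|\/\*([\s\S]*?)\*\/|((?:[ \t]*?\/\/.*\r?\n?)+))[\r\n]*/gm,
  cg: { indent: 1, wholeComment: 2, contentStart: 3 },
  middle: [/^[ \t]*\*/gm, /^[ \t]*\*/gm, /^[ \t]*\/\//gm],
  info: [
    { type: 'multiline', apidoc: true },
    { type: 'multiline' },
    { type: 'singleline' }
  ]
};

// Hypothetical helper: collects all comments with their type and cleaned text.
function extractComments(source, re) {
  var comments = [];
  var match;
  re.regex.lastIndex = 0; // the regex is global, so reset its state first
  while ((match = re.regex.exec(source)) !== null) {
    for (var i = 0; i < re.info.length; i++) {
      var content = match[re.cg.contentStart + i];
      if (content === undefined) continue; // this delimiter was not used
      // Remove the middle-prefix (if any) from each comment line.
      if (re.middle[i]) content = content.replace(re.middle[i], '');
      var text = content.split(/\r?\n/)
        .map(function (line) { return line.trim(); })
        .filter(function (line) { return line !== ''; })
        .join('\n');
      comments.push({
        type: re.info[i].type,
        apidoc: re.info[i].apidoc === true,
        indent: match[re.cg.indent],
        text: text
      });
    }
  }
  return comments;
}

var source = [
  '/**',
  ' * Adds numbers.',
  ' */',
  'function add(a, b) {',
  '  // first line',
  '  // second line',
  '  return a + b;',
  '}'
].join('\n');

var comments = extractComments(source, re);
```

Here comments contains two entries: the apidoc comment in front of the function, and the two consecutive single-line comments, which the regex treats as one comment.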

Variation (codeContext)

For API-documentation, it is important to determine the context of the comment (i.e. the thing that the comment is documenting). Although this does not strictly belong to the comment itself, this library also has methods to determine the code-context of a comment. These are functions that return a JSON object by matching a single line of code against a regular expression.

var detector = commentPattern.codeContext("filename.js");
var cc = detector("function abc(param1,param2) {",2);

The result in cc will be:

{
  begin: 2,
  type: "function statement",
  name: "abc",
  params: ["param1", "param2"],
  string: "abc()",
  original: "function abc(param1,param2) {"
}

This result (for 'JavaScript') is actually taken from the parse-code-context module by Jon Schlinkert. The method codeContext returns a Detector.

API-Reference

commentPattern

Load the comment-pattern for a given file. The file-language is determined by the file-extension.

Params

filename {string}: the name of the file
returns {object}: the comment-patterns

.commentPattern.regex

Load the comment-regex for a given file. The result contains a regex that matches the comments defined in the language specification. It also has information about the different capturing groups of the regex.

Params

filename {string}: the name of the file
returns {object}: an object containing regular expressions and capturing-group metadata, see usage example for details

The code-context detector

Detector

Create a new detector. A detector contains a list of parsers which extract the code context from a list of nodes. It is an immutable object that can be extended, creating a new instance with more parsers.

Params

{function(string)}: parsers

.extend

Creates an extended Detector with additional parsers. A new instance will be created. The old Detector remains untouched.

Params

{function(string)}: moreParsers more parsers. Those are inserted at the beginning of the list, so they override existing parsers.
returns {Detector}: a new Detector instance

.detect

Perform detection. This method calls the included parsers one after another and returns the first non-null result. The line number is returned as the begin property of the result, but the parser function can override it.

Params

{string}: string the line-of-code
{number}: lineNr the line-number
returns {object}
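
The behaviour described for extend and detect can be sketched as follows. This is a hypothetical re-implementation for illustration, not the library's source: the name SketchDetector, the parsers property, and returning null when no parser matches are all assumptions.

```javascript
// Hypothetical stand-in for the Detector described above.
function SketchDetector(parsers) {
  // Copy and freeze the list so that the instance cannot be changed later.
  this.parsers = Object.freeze((parsers || []).slice());
  Object.freeze(this);
}

// Returns a new instance. The additional parsers are put in front,
// so they take precedence over the existing ones.
SketchDetector.prototype.extend = function (moreParsers) {
  return new SketchDetector(moreParsers.concat(this.parsers));
};

// Calls the parsers in order and returns the first non-null result.
// begin defaults to the line number, but a parser result may override it.
SketchDetector.prototype.detect = function (string, lineNr) {
  for (var i = 0; i < this.parsers.length; i++) {
    var result = this.parsers[i](string);
    if (result !== null && result !== undefined) {
      return Object.assign({ begin: lineNr }, result);
    }
  }
  return null; // assumption: no parser matched
};

var base = new SketchDetector([
  function (line) {
    var m = /^function\s+(\w+)/.exec(line);
    return m ? { type: 'function statement', name: m[1] } : null;
  }
]);

var extended = base.extend([
  function (line) {
    var m = /^class\s+(\w+)/.exec(line);
    return m ? { type: 'class', name: m[1] } : null;
  }
]);

var fromBase = base.detect('class Foo {', 7);
var fromExtended = extended.detect('class Foo {', 7);
```

In this sketch base is unchanged after the call to extend, so fromBase stays null while fromExtended describes the class with begin set to 7.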

.parser

Helper function to create a parser from a regex that matches a string and a resolver that parses the match into a code-context object.

Params

{RegExp}: regex a regular expression that is matched against a code-line.
{function(...string)}: resolver a function that resolves the regex match into a code-context object. The function-parameters are the capturing groups of the regex
returns {function}: a function that can be used as parser
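
A helper of this kind might look like the sketch below. The name sketchParser, the example regex and the shape of the resolver result are assumptions for illustration and are not taken from the library's source.

```javascript
// Hypothetical sketch of a parser factory as described above.
function sketchParser(regex, resolver) {
  return function (line) {
    // The regex should not use the g flag, otherwise exec() keeps state.
    var match = regex.exec(line);
    if (match === null) return null;
    // Pass only the capturing groups (match[1], match[2], ...) to the resolver.
    return resolver.apply(null, match.slice(1));
  };
}

// Example parser for lines such as "function abc(param1,param2) {".
var functionStatement = sketchParser(
  /^function\s+([\w$]+)\s*\(([^)]*)\)/,
  function (name, params) {
    return {
      type: 'function statement',
      name: name,
      params: params.split(',')
        .map(function (p) { return p.trim(); })
        .filter(function (p) { return p !== ''; }),
      string: name + '()'
    };
  }
);

var context = functionStatement('function abc(param1,param2) {');
var noContext = functionStatement('var x = 1;');
```

A function created this way takes a single line of code and returns either a code-context object or null, which is the shape a detector expects from its parsers in the description above.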

The database

The language-specification can be found in the languages-directory. There is one file for each language. The actual databases will be created from these files on prepublish.

The content of the language database can be found here

Contributing

See the contributing guide

Run tests

Install dev dependencies:

$ npm i -d && npm test

Related

extract-comments: Uses esprima to extract line and block comments from a string of JavaScript. Also optionally…

Author

Nils Knappmeier

License

Released under the MIT license.

Change Log

This project adheres to Semantic Versioning.

v0.10.1 - 2018-10-04

Chore

Update dependency versions (Removes security vulnerability from lodash@3)

v0.10.0 - 2018-10-04

Add

Multiline comment support for PHP (Sean Snyder)

v0.9.0 - 2016-08-22

Add

Multiline comment support for Ruby and Python

v0.8.1 - 2015-11-08

Fix

Apply JS-Standard Coding-Style
Change method to detect whether standardjs or Thought are installed.

v0.8.0 - 2015-07-20

Add

Code-Context-Functions for Handlebars

v0.7.0 - 2015-06-14

Fix

Fix code-context detection for object properties that are functions. (i.e. "key: function(a,b) {")

Add

C-style multiline-comments split into multiple regexes (/** and /*). /** is marked as "used for apidocs".
Add info property to the output of .regex, which contains additional information (so far only the apidoc: true property).

Breaking changes

.singleLineComment is no longer an array of strings (['#']) but an array of objects ([ { start: '#' } ]) in order to allow them to be marked as "used for apidocs". Definitions can be mixed in the language source files, but will always appear in the second form in the compiled database.

v0.6.0 - 2015-06-08

Add

Add function .codeContext to return code-context parser for different languages.

v0.5.2 - 2015-05-27

Add

Multi-line-comments for Less

v0.5.1 - 2015-05-24

Add

Add .bash as file-extension for shell-scripts.

v0.5.0 - 2015-05-20

Fix

Remove "function()" from generated database files. This led to an error in the test cases.

v0.4.0 - 2015-05-20

Fix

Consecutive indented lines of single-line-comments are now recognized as a single comment by regexes.

v0.3.0 - 2015-05-15

Changed

Next-line-of-code is not matched anymore. The client has to extract the code itself. The line-number is still computed.

v0.2.0 - 2015-05-14

Changed

Some comment delimiters are now regexes rather than strings.
Middle-prefix for C-like languages now allows a preceding whitespace.

Added

Method .regex creates regular expressions that match comments

v0.1.0 - 2015-05-04

Changed

Only strings are used to determine the language from the file extension. /php\d?/ is replaced by php,php3,...

v0.0.x

Initial version