JavaScript Regular Expressions (unfinished)

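What is a regular expression?

Put simply, it is a bit like the search feature built into Windows: a regular expression is mainly used to find specific characters within a string.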

Testing tools

Three ways to test regular expressions are recommended: the online tester on OSChina (开源中国), the RegExp Tester app for Google Chrome, and testing on regex.

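JavaScript pattern matching

JavaScript defines the RegExp() constructor for creating objects that represent text-matching patterns. RegExp objects define a number of useful methods, and strings also have methods that accept a RegExp argument, for example: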

var text = 'testing: 1, 2, 3';
var pattern = /\d+/g; // one or more digits
pattern.test(text); // true: a match was found
text.search(pattern); // 9: position of the first match
text.match(pattern); // ['1', '2', '3']: array of all matches
text.replace(pattern, '#'); // 'testing: #, #, #'
text.split(/\D+/); // ['', '1', '2', '3']: split on runs of non-digit characters
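
The literal syntax and the RegExp() constructor mentioned above should produce equivalent patterns. The following is a minimal sketch of that equivalence; the variable names are illustrative and not from the original. Note that in the string form the backslash itself has to be escaped.

```javascript
// Two ways to build the same pattern.
const literal = /\d+/g;
// In a string, "\\d" is needed to get the regex token \d.
const constructed = new RegExp('\\d+', 'g');

const sample = 'testing: 1, 2, 3';
console.log(literal.source === constructed.source); // true
console.log(sample.match(constructed));             // [ '1', '2', '3' ]
```

The constructor form is mostly useful when the pattern is only known at run time.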

Basic matching

A simple start

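const reg = /the/g
const str = `The fat cat sat on the mat.`
let m
// With the g flag, each exec() call resumes from reg.lastIndex
// and returns null once there are no more matches.
while((m = reg.exec(str)) !== null) {
    // Guard against an infinite loop on zero-length matches.
    if (m.index === reg.lastIndex) {
        reg.lastIndex++;
    }
    console.log(m)
    // m[0] is the full match; further entries would be capture groups.
    m.forEach((match, index) => {
        console.log(`match is: ${match}, index is ${index}`)
    })
}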

m: {
    0: 'the',
    index: 19,
    input: 'The fat cat sat on the mat.',
    length: 1
}
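
The exec() loop above can also be written with String.prototype.matchAll, which returns an iterator over the same kind of match objects and requires the g flag. This is a sketch that assumes a runtime with ES2020 support; the variable names are illustrative.

```javascript
const sentence = 'The fat cat sat on the mat.';
// Each item has the same shape as an exec() result: [0] is the match, .index its position.
const found = [...sentence.matchAll(/the/g)];
for (const hit of found) {
    console.log(`matched "${hit[0]}" at index ${hit.index}`);
}
// matched "the" at index 19
```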

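Metacharacters

Regular expressions rely heavily on metacharacters. Some of them are described below:

Metacharacter	Description
.	Any single character except a line break
[ ]	Any one character listed inside the brackets
[^ ]	Any one character not listed inside the brackets
*	Zero or more repetitions of the item before the *
+	One or more repetitions of the item before the +
?	Marks the item before the ? as optional
{n,m}	Between n and m repetitions of the item before the braces
(xyz)	The exact sequence xyz, treated as a group
|	Alternation: matches the expression before or after the |
\	Escape character, for matching reserved characters such as [ ] ( ) { } . * + ? ^ $ \ |
^	Matches at the start of the input
$	Matches at the end of the input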

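The dot operator

. matches any single character except a line break.

".ar" => The car parked in the garage.

Matches car, par and gar. Applying the same idea as /.at/g to the earlier string The fat cat sat on the mat. produces the following exec() results: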

m[0]: {
    0: 'fat',
    index: 4,
    input: 'The fat cat sat on the mat.',
    length: 1
}
m[1]: {
    0: 'cat',
    index: 8,
    input: 'The fat cat sat on the mat.',
    length: 1
}
m[2]: {
    0: 'sat',
    index: 12,
    input: 'The fat cat sat on the mat.',
    length: 1
}
m[3]: {
    0: 'mat',
    index: 23,
    input: 'The fat cat sat on the mat.',
    length: 1
}
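
As a quick check of the dot operator, the ".ar" example can be run with String.prototype.match. This is only a sketch and the variable names are illustrative.

```javascript
const garage = 'The car parked in the garage.';
const dotMatches = garage.match(/.ar/g);
console.log(dotMatches); // [ 'car', 'par', 'gar' ]

// The dot does not match a line break, so "\nar" has no valid match.
console.log(/.ar/.test('\nar')); // false
```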

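Character sets

Square brackets define a character set: the pattern matches any one of the characters listed inside them.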

"[Tt]he" => The car parked in the garage.

Both The and the are matched.

Inside square brackets, a . is just a literal period.

"ar[.]" => A garage is a good place to park a car.

Matches the ar. at the end of the sentence (ar followed by a literal period).

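Negated character sets

A ^ placed right after the opening bracket negates the set, so it matches any character that is not listed.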

"[^c]ar" => The car parked in the garage.

Matches every ar that is preceded by a character other than c, namely par and gar.
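
The character set examples, plain and negated, can be tried out like this. It is a small sketch with illustrative variable names.

```javascript
const parked = 'The car parked in the garage.';
console.log(parked.match(/[Tt]he/g)); // [ 'The', 'the' ]
console.log(parked.match(/[^c]ar/g)); // [ 'par', 'gar' ]

const place = 'A garage is a good place to park a car.';
// Inside brackets the dot is literal, so only "ar" followed by a period matches.
console.log(place.match(/ar[.]/g));   // [ 'ar.' ]
```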

Repetition

The `*` quantifier

* matches zero or more occurrences of the item that precedes it.

"[a-z]*" => The car parked in the garage #21.

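Matches every run of lowercase letters: he, car, parked, in, the and garage. Because * also allows zero repetitions, a global search additionally reports empty matches at positions where no lowercase letter is present.

Combined with ., the pattern .* matches any sequence of characters that does not contain a line break.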

"\s*cat\s*" => The fat cat sat on the concatenation.

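Matches cat with zero or more whitespace characters on either side. The results are the word cat (together with its surrounding spaces) and the cat inside concatenation.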
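
To see the effect of * in practice, the two examples can be run as follows. This is a sketch; filtering out the empty matches is a choice made here for readability, not something the original prescribes.

```javascript
const lot = 'The car parked in the garage #21.';
// * allows zero repetitions, so a global search also yields empty strings; drop them.
const lowerRuns = lot.match(/[a-z]*/g).filter((w) => w !== '');
console.log(lowerRuns); // [ 'he', 'car', 'parked', 'in', 'the', 'garage' ]

const concat = 'The fat cat sat on the concatenation.';
console.log(concat.match(/\s*cat\s*/g)); // [ ' cat ', 'cat' ]
```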

The + quantifier

+ matches one or more occurrences of the item that precedes it. For example:

"c.+t" => The fat cat sat on the mat.

Matches cat sat on the mat. Because + is greedy, the match runs from the first c to the last t in the sentence.
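
A short sketch of the + example, with illustrative variable names:

```javascript
const matSentence = 'The fat cat sat on the mat.';
console.log(matSentence.match(/c.+t/g)); // [ 'cat sat on the mat' ]

// + needs at least one character between c and t, so "ct" alone does not match.
console.log(/c.+t/.test('ct')); // false
```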

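The ? quantifier

? marks the preceding item as optional, meaning it occurs zero times or once. It is equivalent to {0,1}.

[T]?he matches both he and The.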

"[T]?he" => The car is parked in the garage.

Matches The and he (the he inside the).
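
The optional quantifier and its {0,1} equivalent can be compared directly. A minimal sketch with illustrative names:

```javascript
const carSentence = 'The car is parked in the garage.';
const withQuestionMark = carSentence.match(/[T]?he/g);
const withBraces = carSentence.match(/[T]{0,1}he/g);
console.log(withQuestionMark); // [ 'The', 'he' ]
console.log(withBraces);       // [ 'The', 'he' ]
```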

The {} quantifier

{} specifies how many times the preceding item may repeat.

"[0-9]{2,3}" => The number was 9.9997 but we rounded it off to 10.0.

Matches 999 (from 9.9997) and 10 (from 10.0).

The second number can be omitted: [0-9]{2,} matches two or more digits.

[0-9]{3} matches exactly three digits.

"[0-9]{2,}" => The number was 9.9997 but we rounded it off to 10.0.

Matches 9997 and 10.
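
The three forms of the brace quantifier can be compared on the same input. A sketch with an illustrative variable name:

```javascript
const rounded = 'The number was 9.9997 but we rounded it off to 10.0.';
console.log(rounded.match(/[0-9]{2,3}/g)); // [ '999', '10' ]
console.log(rounded.match(/[0-9]{2,}/g));  // [ '9997', '10' ]
console.log(rounded.match(/[0-9]{3}/g));   // [ '999' ]
```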

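(…) Capturing groups

A capturing group is a sub-pattern written inside (…). For example, (ab)* matches zero or more consecutive occurrences of ab.

Inside (), | means "or".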

"(c|g|p)ar" => The car is parked in the garage.

Matches car, par and gar.

The | operator

For example, (T|t)he|car matches The, the or car.
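
Groups and alternation can be sketched as follows. The variable names are illustrative; the second half shows that exec() also exposes what each group captured.

```javascript
const inGarage = 'The car is parked in the garage.';
console.log(inGarage.match(/(c|g|p)ar/g));    // [ 'car', 'par', 'gar' ]
console.log(inGarage.match(/(T|t)he|car/g));  // [ 'The', 'car', 'the' ]

// Without the g flag, exec() returns the full match plus each captured group.
const firstHit = /(c|g|p)ar/.exec(inGarage);
console.log(firstHit[0], firstHit[1]);        // car c

// A quantifier after a group repeats the whole group.
console.log('ababab'.match(/(ab)*/)[0]);      // ababab
```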

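Escaping special characters

A backslash \ escapes the character after it so that it is matched literally. For example, \. matches a real period instead of any character.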

"(f|c|m)at\.?" => The fat cat sat on the mat.

Matches fat, cat and mat. (the last match includes the trailing period, which \.? makes optional).
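
The difference between an escaped and an unescaped dot can be seen in a short sketch (variable names are illustrative):

```javascript
const onTheMat = 'The fat cat sat on the mat.';
console.log(onTheMat.match(/(f|c|m)at\.?/g)); // [ 'fat', 'cat', 'mat.' ]

// Unescaped, the dot matches any character; escaped, only a literal period.
console.log(/a.c/.test('abc'));  // true
console.log(/a\.c/.test('abc')); // false
console.log(/a\.c/.test('a.c')); // true
```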

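Anchors

The ^ anchor

^ checks that the match starts at the beginning of the input string. For example, ^(T|t)he matches The or the only at the very start of the string.

"(T|t)he" => The car is parked in the garage.

Matches both The and the.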

"^(T|t)he" => The car is parked in the garage.

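Matches only The, the occurrence at the start of the string.

The $ anchor

$ checks that the match ends at the end of the input string.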

"(at\.)" => The fat cat. sat. on the mat.

Matches at. three times: in cat., sat. and mat.

"(at\.)$" => The fat cat. sat. on the mat.

Matches only the final at. (the one in mat.).
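
Both anchors can be checked with a few lines. This is a sketch with illustrative variable names.

```javascript
const startSample = 'The car is parked in the garage.';
console.log(startSample.match(/(T|t)he/g));  // [ 'The', 'the' ]
console.log(startSample.match(/^(T|t)he/g)); // [ 'The' ]

const endSample = 'The fat cat. sat. on the mat.';
console.log(endSample.match(/(at\.)/g));     // [ 'at.', 'at.', 'at.' ]
console.log(endSample.match(/(at\.)$/g));    // [ 'at.' ]
// The anchored match sits at the very end of the string.
console.log(endSample.search(/(at\.)$/) === endSample.length - 3); // true
```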

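Shorthand character classes

Shorthand	Description
.	Any character except a line break
\w	Any letter, digit or underscore; equivalent to [a-zA-Z0-9_]
\W	Any character that is not a letter, digit or underscore; equivalent to [^\w]
\d	Any digit; equivalent to [0-9]
\D	Any non-digit; equivalent to [^\d]
\s	Any whitespace character, such as space, tab, line break or form feed ([ \t\n\f\r\v] plus other Unicode spaces)
\S	Any non-whitespace character; equivalent to [^\s]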
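
The shorthand classes in the table can be explored with a small sketch. The sample string below is made up for illustration and does not come from the original.

```javascript
// Hypothetical sample line containing letters, digits, an underscore and a tab.
const label = 'id_42: 3.5 kg\t(net)';
console.log(label.match(/\d+/g)); // [ '42', '3', '5' ]
console.log(label.match(/\w+/g)); // [ 'id_42', '3', '5', 'kg', 'net' ]
console.log(label.split(/\s+/));  // [ 'id_42:', '3.5', 'kg', '(net)' ]

// \w includes the underscore; \W is its complement.
console.log(/\w/.test('_'), /\W/.test('_')); // true false
```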

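Lookaround assertions

Lookahead and lookbehind are both non-capturing groups: they check that a pattern is present (or absent) but do not include it in the match.

For example, to get every number that follows a $, we can use the positive lookbehind (?<=\$)[0-9\.]*. It matches a run of the characters 0 to 9 and . that is preceded by a $, without including the $ itself.

Symbol	Description
?=	Positive lookahead
?!	Negative lookahead
?<=	Positive lookbehind
?<!	Negative lookbehind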

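Positive lookahead

A positive lookahead requires the main expression to be immediately followed by the expression given after ?=.

"(T|t)he(?=\sfat)" => The fat cat sat on the mat.

Matches The or the only when it is immediately followed by a space and fat. Here that is the The at the start of the sentence.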

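Negative lookahead

A negative lookahead requires that the main expression is not followed by the expression given after ?!.

"(T|t)he(?!\sfat)" => The fat cat sat on the mat.

Matches the the that precedes mat.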

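Positive lookbehind

A positive lookbehind requires the main expression to be immediately preceded by the expression given after ?<=.

"(?<=(T|t)he\s)(fat|mat)" => The fat cat sat on the mat.

Matches fat and mat.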

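Negative lookbehind

A negative lookbehind requires that the main expression is not preceded by the expression given after ?<!.

"(?<!(T|t)he\s)(cat)" => The cat sat on cat.

Matches only the last cat.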
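
All four lookaround forms can be tried side by side. This sketch assumes a JavaScript engine with lookbehind support (introduced in ES2018); the price string is a made-up example, and + is used instead of * there to avoid empty matches.

```javascript
const fatCat = 'The fat cat sat on the mat.';
console.log(fatCat.match(/(T|t)he(?=\sfat)/g));        // [ 'The' ]
console.log(fatCat.match(/(T|t)he(?!\sfat)/g));        // [ 'the' ]
console.log(fatCat.match(/(?<=(T|t)he\s)(fat|mat)/g)); // [ 'fat', 'mat' ]

const twoCats = 'The cat sat on cat.';
const lastCat = /(?<!(T|t)he\s)(cat)/.exec(twoCats);
console.log(lastCat[0], lastCat.index);                // cat 15

// Hypothetical input: collect the numbers that follow a dollar sign.
const prices = 'Coffee $4.50, cake $3';
console.log(prices.match(/(?<=\$)[0-9.]+/g));          // [ '4.50', '3' ]
```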

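Flags

Flag	Description
i	Case-insensitive matching
g	Global search: find all matches rather than stopping at the first
m	Multiline: ^ and $ match at the start and end of each line

Case-insensitive

"The" => The fat cat sat on the mat.

Matches only The.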

"/The/gi" => The fat cat sat on the mat.

Matches both The and the.

The multiline flag

"/.at(.)?$/gm" => The fat
                  cat sat
                  on the mat.

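Matches fat, sat and mat. (including its period), one at the end of each line. Without the m flag, $ would only match at the end of the whole string.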
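
The effect of the i and m flags can be compared with and without each flag. A sketch with illustrative variable names:

```javascript
const flagSample = 'The fat cat sat on the mat.';
console.log(flagSample.match(/The/g));  // [ 'The' ]
console.log(flagSample.match(/The/gi)); // [ 'The', 'the' ]

const threeLines = 'The fat\ncat sat\non the mat.';
// With m, $ matches at the end of every line.
console.log(threeLines.match(/.at(.)?$/gm)); // [ 'fat', 'sat', 'mat.' ]
// Without m, $ matches only at the end of the whole string.
console.log(threeLines.match(/.at(.)?$/g));  // [ 'mat.' ]
```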

Common regular expressions

Positive integer: /\d+/
Negative integer: /-\d+/
Phone country code (0086): /\+?[\d\s]{3,}/
Integer: /-?\d+/
Username: /[\w\d_.]{4,16}/
Digits and English letters: /[0-9a-zA-Z]*/
Digits, English letters and spaces: /[0-9a-zA-Z ]/
Password (skipped for now, still disputed): /^(?=^.{6,}$)((?=.*[A-Za-z0-9])(?=.*[A-Z])(?=.*[a-z]))^.*$/
Email: /^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[A-Za-z]{2,4})*$/
IPv4 address: /^((?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))*$/
Lowercase letters only: ([a-z])*
Uppercase letters only: ([A-Z])*
Date (MM/DD/YYYY): ^(0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?[0-9]{2}$
Date (YYYY/MM/DD): ^(19|20)?[0-9]{2}[- \/.](0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])$
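
When patterns like these are used to validate a whole input rather than to search inside it, they generally need ^ and $ around them. The sketch below illustrates this under some assumptions: isInteger is a hypothetical helper, and the IPv4 pattern is the one listed above with the outer (...)* wrapper dropped so that an empty string is not accepted.

```javascript
// Hypothetical helper: anchored version of the integer pattern.
const isInteger = (input) => /^-?\d+$/.test(input);

// IPv4 pattern from the list, without the outer (...)* wrapper (an assumption of this sketch).
const ipv4 = /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/;

// MM/DD/YYYY pattern exactly as listed.
const monthDayYear = /^(0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?[0-9]{2}$/;

console.log(isInteger('-42'));                // true
console.log(isInteger('4.2'));                // false
console.log(/-?\d+/.test('4.2'));             // true: unanchored, it finds the 4
console.log(ipv4.test('192.168.0.1'));        // true
console.log(ipv4.test('256.1.1.1'));          // false
console.log(monthDayYear.test('12/31/1999')); // true
console.log(monthDayYear.test('13/01/1999')); // false
```

Note that these patterns only check the shape of the input: the date pattern, for instance, still accepts 02/31/1999.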

Source

Adapted, with modifications, from learn regex the easy way.