Splitting a chat log by specific rules with regular expressions

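This article presents a way to parse chat logs in a complex format with PHP, showing how regular-expression matching and filtering can extract the role, timestamp, and message content of each entry.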

The goal is to split the chat log into one entry per utterance, each with role, time, and message fields.

<?php
$data = '[S][04:39:35] [#1][V][04:39:35] 不忈: Hello[V][04:39:35] 不忈: Hello[V][04:39:35] 不忈: Hello[V][04:40:11] 不忈: 你好[A][04:40:38] agent01: 嗯[S][04:40:54] [#8]给你[S][04:40:57] agent03 [#6][S][04:41:00] agent01 [#7][A][04:41:02] agent03: hade [V][04:41:06] 不忈: 一[A][04:41:10] agent03: en [S][04:41:20] [#8]-[S][04:41:22] agent01 [#6][S][04:41:24] agent03 [#7][A][04:41:27] agent01: 32423[V][04:41:29] 不忈: 。[S][04:41:31] [#2]agent01';
// Split the log into records: each record starts with a role tag
// ([S], [V] or [A]) and runs up to the next role tag or the end of the string.
preg_match_all('/\[[SVA]\].*?(?=\[[SVA]\]|$)/', $data, $matches);

$res = [];
foreach ($matches[0] as $k => $v) {
    // The first two bracketed tokens in a record are the role tag and the timestamp.
    preg_match_all('/\[[SVA]\]|\[\d{2}:\d{2}:\d{2}\]/', $v, $vMatch);
    $res[$k]['role'] = $vMatch[0][0];
    $res[$k]['time'] = $vMatch[0][1];
    // Strip the role tag and the timestamp; what remains is the message text.
    $res[$k]['message'] = preg_replace('/\[[SVA]\]|\[\d{2}:\d{2}:\d{2}\]/', '', $v);
}
print_r($res);

Result after splitting:

Array
(
    [0] => Array
        (
            [role] => [S]
            [time] => [04:39:35]
            [message] =>  [#1]
        )

    [1] => Array
        (
            [role] => [V]
            [time] => [04:39:35]
            [message] =>  不忈: Hello
        )

    [2] => Array
        (
            [role] => [V]
            [time] => [04:39:35]
            [message] =>  不忈: Hello
        )

    [3] => Array
        (
            [role] => [V]
            [time] => [04:39:35]
            [message] =>  不忈: Hello
        )

    [4] => Array
        (
            [role] => [V]
            [time] => [04:40:11]
            [message] =>  不忈: 你好
        )

    [5] => Array
        (
            [role] => [A]
            [time] => [04:40:38]
            [message] =>  agent01: 嗯
        )

    [6] => Array
        (
            [role] => [S]
            [time] => [04:40:54]
            [message] =>  [#8]给你
        )

    [7] => Array
        (
            [role] => [S]
            [time] => [04:40:57]
            [message] =>  agent03 [#6]
        )

    [8] => Array
        (
            [role] => [S]
            [time] => [04:41:00]
            [message] =>  agent01 [#7]
        )

    [9] => Array
        (
            [role] => [A]
            [time] => [04:41:02]
            [message] =>  agent03: hade 
        )

    [10] => Array
        (
            [role] => [V]
            [time] => [04:41:06]
            [message] =>  不忈: 一
        )

    [11] => Array
        (
            [role] => [A]
            [time] => [04:41:10]
            [message] =>  agent03: en 
        )

    [12] => Array
        (
            [role] => [S]
            [time] => [04:41:20]
            [message] =>  [#8]-
        )

    [13] => Array
        (
            [role] => [S]
            [time] => [04:41:22]
            [message] =>  agent01 [#6]
        )

    [14] => Array
        (
            [role] => [S]
            [time] => [04:41:24]
            [message] =>  agent03 [#7]
        )

    [15] => Array
        (
            [role] => [A]
            [time] => [04:41:27]
            [message] =>  agent01: 32423
        )

    [16] => Array
        (
            [role] => [V]
            [time] => [04:41:29]
            [message] =>  不忈: 。
        )

    [17] => Array
        (
            [role] => [S]
            [time] => [04:41:31]
            [message] =>  [#2]agent01
        )

)
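As a variation on the script above, the splitting and the field extraction could probably be combined into a single pattern with named capture groups. The sketch below is not from the original post: the function name `parseChatLog` and the shortened sample string are hypothetical. It also assumes that a timestamp always follows the role tag directly, which holds for the sample data shown here but may not hold for every log.

```php
<?php
// Hypothetical helper (not part of the original post): parse the log in one pass.
function parseChatLog(string $log): array
{
    // A record is a role tag, a timestamp, then any text up to the next
    // "[role][hh:mm:ss]" pair or the end of the string.
    $pattern = '/\[(?<role>[SVA])\]\[(?<time>\d{2}:\d{2}:\d{2})\]'
             . '(?<message>.*?)(?=\[[SVA]\]\[\d{2}:\d{2}:\d{2}\]|\z)/su';

    $records = [];
    if (preg_match_all($pattern, $log, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $records[] = [
                'role'    => $m['role'],           // letter only, without brackets
                'time'    => $m['time'],           // hh:mm:ss, without brackets
                'message' => trim($m['message']),  // surrounding whitespace removed
            ];
        }
    }
    return $records;
}

// Shortened sample in the same format as $data above (assumed for illustration).
$sample = '[S][04:39:35] [#1][V][04:39:35] 不忈: Hello[A][04:40:38] agent01: 嗯';
print_r(parseChatLog($sample));
```

Unlike the output listed above, this variant returns the role and time without their square brackets and trims the whitespace around the message. Because the lookahead requires a role tag followed by a timestamp, a bare `[S]`, `[V]`, or `[A]` inside a message body would not start a new record.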

Reposted from: https://my.oschina.net/wsyblog/blog/888566
