"TIMESTAMP (UTC)","LOG TYPE","DEVICE TYPE","DEVICE","MESSAGE","PARAMETERS"
"2014-08-12 17:30:34.437","Warning","DiverGate","141403G00294","Diver gate(s) did not connect since","2014-08-08 06:37:31 (UTC)"
"2014-08-12 17:30:34.577","Warning","DiverGate","141403G00120","Diver gate(s) did not connect since","2014-08-08 06:46:22 (UTC)"
"2014-08-13 06:45:18.890","Error","DiverGate","141403G00294","Was set to inactive, because it did not connect since","2014-08-08 06:37:31 (UTC)"
"2014-08-13 07:00:18.903","Error","DiverGate","141403G00120","Was set to inactive, because it did not connect since","2014-08-08 06:46:22 (UTC)"
This is my .csv file, and I need to read the information from it. I need to split each line on the commas that are outside double quotes, because in some other files a comma can appear inside a field, especially in the message, log type, ...
string url = @"E:\Project.csv";
Stream stream = File.Open(url, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
string[] lines = null;
using (StreamReader sr = new StreamReader(stream))
{
    string str = sr.ReadToEnd();
    lines = Regex.Split(str, /* what expression goes here? */);
}
You can try lookarounds. They do not consume characters in the string; they only assert whether a match is possible or not.
(?<="),(?=")
Here is an online demo; the pattern was also tested at regexstorm.
The pattern explanation is very simple:
(?<=     look behind to see if there is:
  "        '"'
)        end of look-behind
,        ','
(?=      look ahead to see if there is:
  "        '"'
)        end of look-ahead
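As a rough sketch of how this pattern could be applied to the kind of file in the question: split the text into lines first, then split each line on the lookaround pattern. The names `QuotedCsvSplit` and `SplitLine` are made up for illustration and are not part of any library. The pattern leaves the enclosing quotes in place, so the sketch trims them. It also assumes that every field is quoted and that no field contains the sequence `","` itself.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedCsvSplit
{
    // Matches a comma only when it sits directly between two double quotes.
    private static readonly Regex Separator = new Regex("(?<=\"),(?=\")");

    // Hypothetical helper: splits one line and strips the enclosing quotes.
    public static string[] SplitLine(string line)
    {
        return Separator.Split(line)
                        .Select(field => field.Trim('"'))
                        .ToArray();
    }

    public static void Main()
    {
        string text =
            "\"LOG TYPE\",\"MESSAGE\"\r\n" +
            "\"Error\",\"Was set to inactive, because it did not connect since\"";

        // Split into lines first so that "\r\n" never ends up inside a field.
        string[] lines = text.Split(new[] { "\r\n", "\n" },
                                    StringSplitOptions.RemoveEmptyEntries);

        foreach (string line in lines)
        {
            Console.WriteLine(string.Join(" | ", SplitLine(line)));
        }
    }
}
```

With the sample text above, the comma inside the message is left alone because it is not surrounded by quotes, so the second line yields two fields rather than three.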
\"PARAMETERS\"\r\n\"2014-08-12 17:30:34.437\""
I understand what \r\n means, but instead of keeping \r\n in the field, how do I put all the text to the right of \r\n into a new string?
This is just basic CSV parsing, and there are already libraries that do it. Rather than trying to re-invent the wheel, I would recommend taking a look at CsvHelper, which I've used before.
You can include this in your project really easily by using the Package Manager Console and typing:
Install-Package CsvHelper
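Once the package is installed, reading the file could look roughly like the sketch below. It assumes a reasonably recent CsvHelper version in which the `CsvReader` constructor takes a `TextReader` and a `CultureInfo`; older releases differ. The class name `CsvHelperSketch` and the helper `ReadMessages` are hypothetical, and the column name `MESSAGE` comes from the header row in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using CsvHelper;

public static class CsvHelperSketch
{
    // Hypothetical helper: returns the MESSAGE column of every data row.
    public static List<string> ReadMessages(TextReader reader)
    {
        var messages = new List<string>();
        using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
        {
            csv.Read();        // advance to the header row
            csv.ReadHeader();  // register column names for lookup by name
            while (csv.Read()) // one call per data row
            {
                messages.Add(csv.GetField("MESSAGE"));
            }
        }
        return messages;
    }

    public static void Main()
    {
        using (var reader = new StreamReader(@"E:\Project.csv"))
        {
            foreach (string message in ReadMessages(reader))
            {
                Console.WriteLine(message);
            }
        }
    }
}
```

The library deals with the quoting and the line endings itself, so commas inside quoted fields need no special handling here.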
Instead of rolling your own CSV parser, use an existing library. There is the TextFieldParser class, which ships with Visual Basic. Just add a reference to Microsoft.VisualBasic under project references (the class lives in the Microsoft.VisualBasic.FileIO namespace); then you can do:
// requires: using Microsoft.VisualBasic.FileIO;
using (TextFieldParser textFieldParser = new TextFieldParser(@"E:\Project.csv"))
{
    textFieldParser.TextFieldType = FieldType.Delimited;
    textFieldParser.SetDelimiters(",");
    // Quoted fields are honored by default (HasFieldsEnclosedInQuotes is true).
    while (!textFieldParser.EndOfData)
    {
        string[] values = textFieldParser.ReadFields();
        Console.WriteLine(string.Join("---", values)); // print the row
    }
}
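You can also use this regex, which splits on a comma only when it is followed by an even number of double quotes (that is, when the comma is outside a quoted field):
var result = Regex.Split(samplestring, ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");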
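To illustrate the even-number-of-quotes idea, here is a minimal sketch that applies such a pattern to one line at a time. It uses double quotes in the pattern, since the sample data is quoted with double quotes. The names `EvenQuoteSplit` and `SplitLine` are invented for the example. It assumes the quotes on each line are balanced and that fields contain no escaped quotes.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EvenQuoteSplit
{
    // A comma qualifies only if an even number of double quotes follows it
    // up to the end of the line, i.e. the comma is not inside a quoted field.
    private static readonly Regex Separator =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    // Hypothetical helper: splits one line and strips enclosing quotes.
    public static string[] SplitLine(string line)
    {
        return Separator.Split(line)
                        .Select(field => field.Trim('"'))
                        .ToArray();
    }

    public static void Main()
    {
        // Mixes an unquoted field with quoted ones on purpose.
        string line = "2014-08-13,\"Error\"," +
                      "\"Was set to inactive, because it did not connect since\"";

        foreach (string field in SplitLine(line))
        {
            Console.WriteLine(field);
        }
    }
}
```

Unlike the lookaround pattern that requires a quote on both sides of the comma, this one also copes with unquoted fields. The lookahead rescans the rest of the line for every comma, though, so it is probably best applied per line rather than to the whole file at once.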