Parsing CSV text is a common enough task; this is one way of doing it using .NET regular expressions.
Hi,"How, are you","I'm good thanks, you?"
Hi again, How, are you now,"I'm still good thanks"
You can see that there is text both inside and outside double quotes, and commas inside quotes. All of these situations need to be coped with.
("(?<target>[^"]*)"|(?<target>[^",]+))(,\s*|(?<line>\r?\n|$))
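"(?<target>[^"]*)" matches a quoted item and puts its contents, without the quotes, in the named group 'target'
(?<target>[^",]+) matches an unquoted item and puts it in the same 'target' group
,\s* matches the comma after an item, plus any whitespace that follows it
(?<line>\r?\n|$) matches the end of a line or the end of the file and records it in the named group 'line'
Running this in a regex tester shows the required results in the 'target' group. The 'line' group succeeds only when a match ends a line or the file: it contains the line break at the end of a line and is empty at the end of the file.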
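If you don't have a regex tester to hand, the same check can be sketched in a few lines of C#. This is only an illustration: the class and method names (RegexTesterSketch, Describe) are made up for this example, and the pattern is the one shown above written as a verbatim string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexTesterSketch
{
    // Same pattern as above; inside a C# verbatim string each quote is doubled.
    private const string Pattern =
        @"(""(?<target>[^""]*)""|(?<target>[^"",]+))(,\s*|(?<line>\r?\n|$))";

    // Returns one description per match: the field text and how the match ended.
    public static string[] Describe(string csvText)
    {
        var matches = Regex.Matches(csvText, Pattern);
        var result = new string[matches.Count];
        for (var i = 0; i < matches.Count; i++)
        {
            var line = matches[i].Groups["line"];
            // 'line' fails after a comma, holds the line break at the end of a
            // line, and succeeds with an empty value at the end of the file.
            var ending = !line.Success ? "comma"
                : line.Length > 0 ? "end of line"
                : "end of file";
            result[i] = matches[i].Groups["target"].Value + " -> " + ending;
        }
        return result;
    }
}
```

Calling RegexTesterSketch.Describe on the two sample lines gives seven entries, the third ending in "end of line" and the last in "end of file".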
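using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Extension methods must live in a static class; the class name here is arbitrary.
public static class CsvExtensions {
    public static string[][] ParseCsv(this string csvText) {
        var csvRegex = new Regex(
            @"(""(?<target>[^""]*)""|(?<target>[^"",]+))(,\s*|(?<line>\r?\n|$))");
        var lines = new List<string[]>();
        var line = new List<string>();
        foreach (var match in csvRegex.Matches(csvText).Cast<Match>()) {
            // each match holds exactly one field
            line.Add(match.Groups["target"].Value);
            if (!match.Groups["line"].Success) continue;
            // end of line or file found
            lines.Add(line.ToArray());
            if (match.Groups["line"].Length > 0) {
                // end of line, so start collecting the next one
                line = new List<string>();
            }
        }
        return lines.ToArray();
    }
}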
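One case the sample text does not exercise: [^",]+ also matches line breaks, so an unquoted item at the end of a line runs on into the first item of the next line. A possible variant, offered only as a sketch, excludes \r and \n from unquoted items. The class, constant and method names below are hypothetical, and the variant pattern is a suggestion rather than part of the original expression.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnquotedLineEndSketch
{
    // The pattern from the article.
    public const string Original =
        @"(""(?<target>[^""]*)""|(?<target>[^"",]+))(,\s*|(?<line>\r?\n|$))";

    // Assumed variant: unquoted items may no longer contain \r or \n.
    public const string Variant =
        @"(""(?<target>[^""]*)""|(?<target>[^"",\r\n]+))(,\s*|(?<line>\r?\n|$))";

    // Collects the 'target' value of every match, in order.
    public static List<string> Targets(string pattern, string csvText)
    {
        var targets = new List<string>();
        foreach (Match match in Regex.Matches(csvText, pattern))
        {
            targets.Add(match.Groups["target"].Value);
        }
        return targets;
    }
}
```

For the input "a,b\nc,d" the original pattern yields three items, the middle one being "b\nc", while the variant yields the four items a, b, c and d.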
Something to note: I suspect this will not be very efficient for large amounts of data, as it takes a string as its input. For a large file I'd use a stream as the input, and therefore a different strategy from regular expressions.
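As a rough idea of what a stream-based strategy could look like, here is a minimal sketch that reads from a TextReader one character at a time and yields each line as soon as it is complete. It is an assumption about one possible approach, not a tested replacement. The names CsvStreamSketch and ReadCsv are invented for the example. Unlike the regular expression, it keeps empty items, does not skip whitespace after a comma, and allows line breaks inside quotes.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvStreamSketch
{
    // Lazily yields one string[] per line, so only the current line is buffered.
    public static IEnumerable<string[]> ReadCsv(TextReader reader)
    {
        var fields = new List<string>();
        var field = new StringBuilder();
        var inQuotes = false;
        var sawAnything = false; // true once the current line has any content
        int next;
        while ((next = reader.Read()) != -1)
        {
            var c = (char)next;
            if (c == '"')
            {
                inQuotes = !inQuotes;   // quotes delimit an item and are not kept
                sawAnything = true;
            }
            else if (inQuotes)
            {
                field.Append(c);        // commas and line breaks are literal here
            }
            else if (c == ',')
            {
                fields.Add(field.ToString());
                field.Clear();
                sawAnything = true;
            }
            else if (c == '\n')
            {
                if (sawAnything)
                {
                    fields.Add(field.ToString());
                    yield return fields.ToArray();
                }
                fields.Clear();
                field.Clear();
                sawAnything = false;
            }
            else if (c != '\r')         // '\r' outside quotes is ignored
            {
                field.Append(c);
                sawAnything = true;
            }
        }
        if (sawAnything)                // last line without a trailing line break
        {
            fields.Add(field.ToString());
            yield return fields.ToArray();
        }
    }
}
```

With a file, this would be called along the lines of foreach (var line in CsvStreamSketch.ReadCsv(File.OpenText(path))), where path is whatever file you are reading.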