Parsing CSV text is a common enough task; this is one way of doing it using .NET regular expressions.
    Hi,"How, are you","I'm good thanks, you?"
    Hi again, How, are you now,"I'm still good thanks"
You can see that there is text both inside and outside double quotes, and that there are commas inside the quotes. All of these situations need to be coped with.
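    ("(?<target>[^"]*)"|(?<target>[^",\r\n]+))(,\s*|(?<line>\r?\n|$))

"(?<target>[^"]*)" matches any quoted item and puts the result in the named group 'target'
(?<target>[^",\r\n]+) matches non-quoted items; \r and \n are excluded so that an unquoted item at the end of a line cannot swallow the line break and the first item of the next line
,\s* matches a comma and any whitespace that follows it
(?<line>\r?\n|$) matches the end of a line or the end of the file and records it in the named group 'line'

Using the regex tester you can see that the required values are found in the 'target' group, and that the end of a line or the end of the file is indicated by a successful 'line' group (non-empty for a line break, empty for the end of the file).

    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;

    // Extension methods must live in a static class; the class name is arbitrary.
    public static class CsvExtensions {
        public static string[][] ParseCsv(this string csvText) {
            var csvRegex = new Regex(
                @"(""(?<target>[^""]*)""|(?<target>[^"",\r\n]+))(,\s*|(?<line>\r?\n|$))");
            var lines = new List<string[]>();
            var line = new List<string>();
            foreach (var match in csvRegex.Matches(csvText).Cast<Match>()) {
                // every match carries exactly one field value
                line.Add(match.Groups["target"].Value);
                if (!match.Groups["line"].Success) continue;
                // end of line or file found
                lines.Add(line.ToArray());
                if (match.Groups["line"].Length > 0) {
                    // end of line: start collecting the next row
                    line = new List<string>();
                }
            }
            return lines.ToArray();
        }
    }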
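To see what the two named groups hold for each match without a regex tester, a small sketch like the following can help. The class name CsvRegexDemo and the method Describe are made up for this illustration. The pattern is the one from this post, written so that the unquoted character class also excludes \r and \n.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, only for illustrating the 'target' and 'line' groups.
public static class CsvRegexDemo
{
    // Pattern from this post; unquoted fields stop at commas, quotes and line breaks.
    private static readonly Regex CsvRegex = new Regex(
        @"(""(?<target>[^""]*)""|(?<target>[^"",\r\n]+))(,\s*|(?<line>\r?\n|$))");

    // Returns one description per match: the field value and what terminated it.
    public static List<string> Describe(string csvText)
    {
        var result = new List<string>();
        foreach (Match match in CsvRegex.Matches(csvText))
        {
            Group lineGroup = match.Groups["line"];
            string ending;
            if (!lineGroup.Success) ending = "comma";            // more fields follow
            else if (lineGroup.Length > 0) ending = "end of line"; // \n or \r\n matched
            else ending = "end of file";                          // $ matched, zero length
            string value = match.Groups["target"].Value;
            result.Add("[" + value + "] " + ending);
        }
        return result;
    }
}
```

For the two sample lines above this yields seven entries, with "[I'm good thanks, you?] end of line" as the third and "[I'm still good thanks] end of file" as the last. Two limitations show up quickly when experimenting this way: an empty field such as the middle one in a,,b produces no match at all and is silently dropped, and doubled quotes ("") inside a quoted field are not treated as an escaped quote.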
Something to note: I suspect this will not be very efficient for large amounts of data, because it takes the whole text as a single string. For a large file I'd take a stream as the input and use a different strategy from regular expressions.
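As a rough idea of what such a stream-based strategy could look like, here is a minimal sketch that reads one character at a time from a TextReader and yields rows lazily. It is not the approach used above and has not been measured against it. The names CsvStreamSketch and ReadCsv are hypothetical. It assumes that spaces and tabs at the start of an unquoted field should be skipped (to mirror the ,\s* part of the regex), that blank lines are ignored, and, like the regex, it does not handle doubled quotes as escapes.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical sketch of a streaming alternative; not a full CSV implementation.
public static class CsvStreamSketch
{
    // Lazily yields one string[] per CSV row read from the given reader.
    public static IEnumerable<string[]> ReadCsv(TextReader reader)
    {
        var row = new List<string>();
        var field = new StringBuilder();
        bool inQuotes = false;
        bool rowHasContent = false; // true once the current line has any field data
        int next;
        while ((next = reader.Read()) != -1)
        {
            char c = (char)next;
            if (inQuotes)
            {
                // inside quotes everything is literal until the closing quote
                if (c == '"') inQuotes = false;
                else field.Append(c);
            }
            else if (c == '"')
            {
                inQuotes = true;
                rowHasContent = true;
            }
            else if (c == ',')
            {
                // field separator: store the field, including empty ones
                row.Add(field.ToString());
                field.Clear();
                rowHasContent = true;
            }
            else if (c == '\n')
            {
                if (rowHasContent)
                {
                    row.Add(field.ToString());
                    yield return row.ToArray();
                }
                row.Clear();
                field.Clear();
                rowHasContent = false;
            }
            else if (c == '\r' || ((c == ' ' || c == '\t') && field.Length == 0))
            {
                // ignore carriage returns and whitespace at the start of a field
            }
            else
            {
                field.Append(c);
                rowHasContent = true;
            }
        }
        // last row when the input does not end with a line break
        if (rowHasContent)
        {
            row.Add(field.ToString());
            yield return row.ToArray();
        }
    }
}
```

It could be called with new StreamReader(path) for a file or new StringReader(text) for a string, so only the current row is held in memory at any time. Unlike the regex version, this sketch keeps empty fields, so a,,b gives three values.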