How do I split a paragraph into sentences?

I am working with some paragraphs that can contain:

  • Sentences
  • Headings
  • Emails
  • Decimal numbers

I am trying to split the paragraph into sentences. So, for example with this input:

Cary Nelson and Stephen Watt. Martin Horton-Eddison. "First Class Essays" Hull, United Kingdom : Purple Peacock Press, 2012 Carol Tenopir and Donald King. "Towards Electronic Journals: Realities for Librarians and Publishers. SLA, 2000. ISBN 0-87111-507-7. "Scholarly Books" and "Peer Review" in Academic Keywords: A Devil's Dictionary for Higher Education. ISBN 0-415-92203-8. [email protected]

I am trying to get this output:

Cary Nelson and Stephen Watt.

Martin Horton-Eddison. "First Class Essays" Hull, United Kingdom : Purple Peacock Press, 2012 Carol Tenopir and Donald King.

"Towards Electronic Journals: Realities for Librarians and Publishers. SLA, 2000.

ISBN 0-87111-507-7.

"Scholarly Books" and "Peer Review" in Academic Keywords: A Devil's Dictionary for Higher Education.

ISBN 0-415-92203-8.

[email protected]

I have tried using this regular expression, but it is not matching in the way I am expecting.

string[] sentences = Regex.Split(strNew, @"(?<=[.!?])\s+(?=[A-Z])");

-------------Problems Reply------------

private void parseParagraph(string input)
{
    // Split wherever a period is followed by a space; the ". " separator itself is dropped.
    string[] lines = input.Split(new[] { ". " }, StringSplitOptions.None);
    foreach (string line in lines)
    {
        Console.WriteLine(line.Trim());
    }
}

This would be a simple example of how to approach this.

Can't you just split the string on the '.' character?

string text = "Cary Nelson and Stephen Watt. Martin Horton-Eddison. \"First Class Essays\" Hull, United Kingdom : Purple Peacock Press, 2012 Carol Tenopir and Donald King. \"Towards Electronic Journals: Realities for Librarians and Publishers. SLA, 2000. ISBN 0-87111-507-7. \"Scholarly Books\" and \"Peer Review\" in Academic Keywords: A Devil's Dictionary for Higher Education. ISBN 0-415-92203-8. [email protected]";

string[] myParagraph = text.Split('.');

String Paragraph = @"Cary Nelson and Stephen Watt. Martin Horton-Eddison. First Class Essays Hull, United Kingdom : Purple Peacock Press, 2012 Carol Tenopir and Donald King. Towards Electronic Journals: Realities for Librarians and Publishers. SLA, 2000. ISBN 0-87111-507-7. Scholarly Books and Peer Review in Academic Keywords: A Devil's Dictionary for Higher Education. ISBN 0-415-92203-8. [email protected]";
String[] sep = {". "};
String[] opt = Paragraph.Split(sep, StringSplitOptions.RemoveEmptyEntries);

string para = @"Cary Nelson and Stephen Watt. Martin Horton-Eddison. ""First Class Essays"" Hull, United Kingdom : Purple Peacock Press, 2012 Carol Tenopir and Donald King. ""Towards Electronic Journals: Realities for Librarians and Publishers. SLA, 2000. ISBN 0-87111-507-7. ""Scholarly Books"" and ""Peer Review"" in Academic Keywords: A Devil's Dictionary for Higher Education. ISBN 0-415-92203-8. [email protected]";

string[] sentences = para.Split(new string[] { ". " }, StringSplitOptions.None);
for (int i = 0; i < sentences.Length; i++)
{
    Console.WriteLine(sentences[i]);
}

You need to split on a period followed by a space (". "). Otherwise the email address gets broken apart at its dots. Normally every sentence is followed by a space.

Happy Coding.....

Here is an example of what you want:

String Paragraph = "Cary Nelson and Stephen Watt. Martin Horton-Eddison. \"First Class Essays\" Hull, United Kingdom : Purple Peacock Press, 2012 Carol Tenopir and Donald King. \"Towards Electronic Journals: Realities for Librarians and Publishers. SLA, 2000. ISBN 0-87111-507-7. \"Scholarly Books\" and \"Peer Review\" in Academic Keywords: A Devil's Dictionary for Higher Education. ISBN 0-415-92203-8. [email protected]"; // Given paragraph

String[] opt = Paragraph.Split('.'); // Split the text on every '.' character
String mail = "";
bool mailflag = false;
foreach (String Row in opt) // Iterate over each piece in the opt array
{
    if (Row.Contains("@") || mailflag)
    {
        if (mailflag)
        {
            // Rejoin the held email piece with the piece that followed its dot
            Console.WriteLine(mail + "." + Row);
            mailflag = false;
            mail = "";
        }
        else
        {
            // Hold the piece containing '@' until the next piece arrives
            mail = Row;
            mailflag = true;
        }
    }
    else
    {
        Console.WriteLine(Row + "\n"); // Prints the piece followed by a blank line; use Console.WriteLine(Row) if you want a single line break
    }
}

OR

String Paragraph = @"Cary Nelson and Stephen Watt. Martin Horton-Eddison. ""First Class Essays"" Hull, United Kingdom : Purple Peacock Press, 2012 Carol Tenopir and Donald King. ""Towards Electronic Journals: Realities for Librarians and Publishers. SLA, 2000. ISBN 0-87111-507-7. ""Scholarly Books"" and ""Peer Review"" in Academic Keywords: A Devil's Dictionary for Higher Education. ISBN 0-415-92203-8. [email protected]"; // Given paragraph
String[] Rows = Paragraph.Split(new string[] { ". " }, StringSplitOptions.None);
foreach (String Row in Rows)
{
    Console.WriteLine(Row + "\n"); // Prints the piece followed by a blank line; use Console.WriteLine(Row) if you want a single line break
}

private void SeparateString(string input)
{
    string[] stringSplit = input.Split('.');
    for (int i = 0; i < stringSplit.Length; i++)
    {
        if (stringSplit[i].Contains("@"))
        {
            // Keep the email piece on the same line as the piece that follows its dot
            Console.Write(stringSplit[i] + ".");
        }
        else
        {
            // Restore the period and end the line, followed by a blank line
            Console.WriteLine(stringSplit[i] + ". " + Environment.NewLine);
        }
    }
}

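This prints each piece on its own line with its period restored, and keeps the piece containing '@' on the same line as the text that follows its next dot.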
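
A hedged alternative to splitting on '.' alone: the minimal sketch below splits only where sentence-ending punctuation is followed by whitespace. The period then stays attached to its sentence, and dots inside decimal numbers or email addresses are left alone because no whitespace follows them. The class name SentenceSplitter, the method name Split and the sample address user@example.com are made up for illustration. It does not reproduce the exact grouping asked for in the question (it also splits before a quoted title, for instance), since that grouping needs more than a punctuation rule.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class SentenceSplitter
{
    // Splits after '.', '!' or '?' only when whitespace follows.
    // The lookbehind keeps the punctuation with the preceding sentence;
    // the matched whitespace is what gets removed.
    public static string[] Split(string paragraph)
    {
        return Regex.Split(paragraph.Trim(), @"(?<=[.!?])\s+");
    }
}
```

Calling SentenceSplitter.Split("Price is 3.50 today. ISBN 0-415-92203-8. Mail user@example.com now.") should give three entries, with "3.50" and the address left intact.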

Category:c# Views:1 Time:2018-11-07
Tags: regex

