Introduction to Stream Editor (sed)

Following up on my previous sed story, I wrote this article to help you get started with sed and enjoy its speed.

Installation

On Unix-like operating systems (macOS, Linux, Unix), sed is already pre-installed. Windows users have four options: Windows Subsystem for Linux (WSL), a virtual machine such as VirtualBox, Git-Bash, or Cygwin. For simplicity, I recommend Cygwin if you only want to run Unix tools in the Windows cmd terminal, or Git-Bash if you also need git for version control.

Syntax

sed [OPTIONS] [COMMAND] [INPUT FILE]        

[INPUT FILE] is the text file that we want to edit; it can be csv, txt, tsv, html, or any other text-based file. By default the output is printed on the screen. If you want to save the output into another file, add > [OUTPUT FILE] at the end:

sed [OPTIONS] [COMMAND] [INPUT FILE] > [OUTPUT FILE]        

If you want to modify the existing INPUT FILE, you can save the changes back into it by adding the -i (in-place) option. The form below is the GNU sed syntax; the sed shipped with macOS expects a backup suffix after -i.

sed -i [COMMAND] [INPUT FILE]        

You can find very comprehensive documentation for sed, but here I just want to share my favorite command:

sed [OPTIONS] s/OLD/NEW/g [INPUT FILE]

"s" is the substitute command: it finds the OLD pattern/string and replaces it with NEW. sed applies the command to every line of the input. "g" (global) is an optional flag that replaces every OLD on a line with NEW; without g, only the first OLD on each line is replaced. For advanced searching and pattern matching, you can use regular expressions.

Examples

As an example (in Windows cmd), we use SampleData.csv, whose content is printed on the screen with the Windows cmd "type" command:

C:\Users\setya\Documents\Demo>type SampleData.csv
RecordNo,Data
Record1,Data1
Record2,Data2
Record3,Data3
Record4,Data4
Record5,Data5        

The command is executed on every line of the file. With the g (global) flag, every occurrence of Data on each line is replaced:

C:\Users\setya\Documents\Demo> sed "s/Data/NewData/g" SampleData.csv
RecordNo,NewData
Record1,NewData1
Record2,NewData2
Record3,NewData3
Record4,NewData4
Record5,NewData5        

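Without g, only the first occurrence on each line is replaced. Every line of SampleData.csv contains Data only once, so for this file the output is the same as with g:

C:\Users\setya\Documents\Demo> sed "s/Data/NewData/" SampleData.csv
RecordNo,NewData
Record1,NewData1
Record2,NewData2
Record3,NewData3
Record4,NewData4
Record5,NewData5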
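
The g flag only makes a visible difference when a line contains OLD more than once. Here is a minimal sketch of that case, written for a Unix-like shell (Git-Bash, Cygwin, WSL, Linux, macOS), so it uses single quotes. The input line Data1,Data2 is made up for illustration and is not part of SampleData.csv.

```shell
# Hypothetical one-line input that contains "Data" twice.
line='Data1,Data2'

# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/Data/NewData/')

# With g: every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/Data/NewData/g')

printf '%s\n' "$first_only"    # NewData1,Data2
printf '%s\n' "$all_matches"   # NewData1,NewData2
```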

Please remember that in Windows cmd the command must be enclosed in double quotes (" "); on Linux/macOS single quotes work as well. So far the output has only been printed on the screen without being saved anywhere. Saving can be done as follows:

C:\Users\setya\Documents\Demo> sed "s/Data/NewData/g" SampleData.csv > NewFile.csv

C:\Users\setya\Documents\Demo>type NewFile.csv
RecordNo,NewData
Record1,NewData1
Record2,NewData2
Record3,NewData3
Record4,NewData4
Record5,NewData5

C:\Users\setya\Documents\Demo>type SampleData.csv
RecordNo,Data
Record1,Data1
Record2,Data2
Record3,Data3
Record4,Data4
Record5,Data5
        

In the example above, we save the changes into NewFile.csv, where we can see the modified content, and we can observe that the original SampleData.csv was not changed.

C:\Users\setya\Documents\Demo> sed -i "s/Data/NewData/g" SampleData.csv

C:\Users\setya\Documents\Demo>type SampleData.csv
RecordNo,NewData
Record1,NewData1
Record2,NewData2
Record3,NewData3
Record4,NewData4
Record5,NewData5        

In the example above, we use the -i option: instead of printing the output on the screen, sed saves the changes by overwriting the original file, and we can observe the changes in SampleData.csv as shown by the last command.
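
Because -i overwrites the file, it can be safer to keep a copy of the original. A minimal sketch, assuming a Unix-like shell and a sed that accepts a backup suffix attached to -i (GNU sed and the macOS sed both do). The temporary directory and the two-line file are made up for this demo so that no real file is touched.

```shell
# Work in a throwaway directory (hypothetical demo data).
workdir=$(mktemp -d)
printf 'RecordNo,Data\nRecord1,Data1\n' > "$workdir/SampleData.csv"

# -i.bak edits the file in place and keeps the original as SampleData.csv.bak
sed -i.bak 's/Data/NewData/g' "$workdir/SampleData.csv"

edited=$(cat "$workdir/SampleData.csv")
backup=$(cat "$workdir/SampleData.csv.bak")

printf 'edited:\n%s\n' "$edited"
printf 'backup:\n%s\n' "$backup"

# Clean up the demo directory.
rm -rf "$workdir"
```

If the result is not what you expected, the .bak file still holds the original content.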

Specific Occurrence

By default sed processes the INPUT FILE line by line, so we can choose which lines the COMMAND applies to by putting (1) a line number or (2) a matching pattern before the COMMAND.

C:\Users\setya\Documents\Demo>type SampleData.csv
RecordNo,NewData
Record1,NewData1
Record2,NewData2
Record3,NewData3
Record4,NewData4
Record5,NewData5

C:\Users\setya\Documents\Demo>sed "s/NewData/Replaced/g" SampleData.csv
RecordNo,Replaced
Record1,Replaced1
Record2,Replaced2
Record3,Replaced3
Record4,Replaced4
Record5,Replaced5
        

In the example above, sed replaces every NewData with Replaced in the file from the previous example.

C:\Users\setya\Documents\Demo>sed "2 s/NewData/Replaced/g" SampleData.csv
RecordNo,NewData
Record1,Replaced1
Record2,NewData2
Record3,NewData3
Record4,NewData4
Record5,NewData5
        

In the example above, we put 2 as the line number (the second line) on which the COMMAND is executed.

C:\Users\setya\Documents\Demo>sed "/Record2/ s/NewData/Replaced/g" SampleData.csv
RecordNo,NewData
Record1,NewData1
Record2,Replaced2
Record3,NewData3
Record4,NewData4
Record5,NewData5        

In the example above, we put /Record2/ (between / /) as the search pattern, and sed only executes the COMMAND on lines where the pattern Record2 is found (line 3).
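
A pattern address matches anywhere in the line, so it can select more lines than intended. Here is a small sketch of that pitfall, assuming a Unix-like shell. The Record10 row is made up for illustration and is not in SampleData.csv.

```shell
# Hypothetical data with a Record10 row added.
sample='Record1,NewData1
Record2,NewData2
Record10,NewData10'

# /Record1/ also matches Record10, because "Record1" is a substring of it.
loose=$(printf '%s\n' "$sample" | sed '/Record1/ s/NewData/Replaced/')

# Anchoring with ^ (start of line) and the trailing comma selects only Record1.
strict=$(printf '%s\n' "$sample" | sed '/^Record1,/ s/NewData/Replaced/')

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

With the loose pattern both the Record1 and Record10 lines are changed; with the anchored pattern only the Record1 line is.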

We can also apply the command to a range of lines, or to every line that matches a regular expression:

C:\Users\setya\Documents\Demo>sed "2,4 {s/NewData/Replaced/g}" SampleData.csv
RecordNo,NewData
Record1,Replaced1
Record2,Replaced2
Record3,Replaced3
Record4,NewData4
Record5,NewData5        

Putting 2,4 before the command means the command is executed from line 2 through line 4.
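
The braces { } are useful when more than one command should run on the same lines. A minimal sketch, assuming a Unix-like shell. The second substitution (Record to Row) is made up for illustration, and the commands are written on separate lines because that form is accepted by both GNU sed and the macOS sed.

```shell
# First four lines of the SampleData.csv used above.
sample='RecordNo,NewData
Record1,NewData1
Record2,NewData2
Record3,NewData3'

# Both substitutions run only on lines 2 to 3.
result=$(printf '%s\n' "$sample" | sed '2,3 {
s/Record/Row/
s/NewData/Replaced/
}')

printf '%s\n' "$result"
```

Lines 2 and 3 become Row1,Replaced1 and Row2,Replaced2, while lines 1 and 4 are left as they are.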

C:\Users\setya\Documents\Demo>sed "/[24]/ {s/NewData/Replaced/g}" SampleData.csv
RecordNo,NewData
Record1,NewData1
Record2,Replaced2
Record3,NewData3
Record4,Replaced4
Record5,NewData5        

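/[24]/ is a regular expression (between / /): the bracket expression [24] means "2 or 4", so the command is executed on every line that contains the character 2 or the character 4. Do not write it as [2,4]: the comma would be matched as a literal character too, and every line of this csv file contains a comma.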
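
Inside a bracket expression every character counts, including a comma. A quick sketch of the difference between /[24]/ and /[2,4]/, assuming a Unix-like shell and using the first three lines of the sample file:

```shell
sample='RecordNo,NewData
Record1,NewData1
Record2,NewData2'

# [24] matches only lines containing the character 2 or 4.
two_or_four=$(printf '%s\n' "$sample" | sed '/[24]/ s/NewData/Replaced/')

# [2,4] matches 2, 4, or a comma, so every csv line is selected.
with_comma=$(printf '%s\n' "$sample" | sed '/[2,4]/ s/NewData/Replaced/')

printf '[24]:\n%s\n' "$two_or_four"
printf '[2,4]:\n%s\n' "$with_comma"
```

With [24] only the Record2 line is changed; with [2,4] all three lines are changed, including the header.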

Multiple File

We can run a sed command on multiple INPUT FILEs by using a glob (wildcard) pattern:

C:\Users\setya\Documents\Demo>dir/w File*.csv
 Volume in drive C is Windows
 Volume Serial Number is 9CB9-9E65

 Directory of C:\Users\setya\Documents\Demo

File1.csv     File10.csv    File100.csv   File11.csv    File12.csv    File13.csv
File14.csv    File15.csv    File16.csv    File17.csv    File18.csv    File19.csv
File2.csv     File20.csv    File21.csv    File22.csv    File23.csv    File24.csv
File25.csv    File26.csv    File27.csv    File28.csv    File29.csv    File3.csv
File30.csv    File31.csv    File32.csv    File33.csv    File34.csv    File35.csv
File36.csv    File37.csv    File38.csv    File39.csv    File4.csv     File40.csv
File41.csv    File42.csv    File43.csv    File44.csv    File45.csv    File46.csv
File47.csv    File48.csv    File49.csv    File5.csv     File50.csv    File51.csv
File52.csv    File53.csv    File54.csv    File55.csv    File56.csv    File57.csv
File58.csv    File59.csv    File6.csv     File60.csv    File61.csv    File62.csv
File63.csv    File64.csv    File65.csv    File66.csv    File67.csv    File68.csv
File69.csv    File7.csv     File70.csv    File71.csv    File72.csv    File73.csv
File74.csv    File75.csv    File76.csv    File77.csv    File78.csv    File79.csv
File8.csv     File80.csv    File81.csv    File82.csv    File83.csv    File84.csv
File85.csv    File86.csv    File87.csv    File88.csv    File89.csv    File9.csv
File90.csv    File91.csv    File92.csv    File93.csv    File94.csv    File95.csv
File96.csv    File97.csv    File98.csv    File99.csv
             100 File(s)         10,200 bytes
               0 Dir(s)  177,604,562,944 bytes free

C:\Users\setya\Documents\Demo>sed "s/NewData/Replaced/g" File*.csv        

In the example above, the sed command is executed on all files that match the glob pattern File*.csv (File1.csv to File100.csv).
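
Without -i, sed reads the matching files one after another and prints a single combined output; with -i, each file is rewritten separately. A minimal sketch of both behaviors, assuming a Unix-like shell and a sed that accepts -i with an attached backup suffix (GNU sed and the macOS sed both do). The two small files are made up for this demo.

```shell
# Hypothetical demo files in a throwaway directory.
workdir=$(mktemp -d)
printf 'Record1,NewData1\n' > "$workdir/File1.csv"
printf 'Record2,NewData2\n' > "$workdir/File2.csv"

# Print only: the files are processed in order into one combined output.
combined=$(sed 's/NewData/Replaced/g' "$workdir"/File*.csv)

# In place: every matching file is rewritten and gets its own .bak copy.
sed -i.bak 's/NewData/Replaced/g' "$workdir"/File*.csv
file2=$(cat "$workdir/File2.csv")
file2_backup=$(cat "$workdir/File2.csv.bak")

printf 'combined:\n%s\n' "$combined"
printf 'File2.csv after -i.bak: %s\n' "$file2"

# Clean up the demo directory.
rm -rf "$workdir"
```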

For further exploration, you can refer to:

https://www.gnu.org/software/sed/manual/

https://www.grymoire.com/Unix/Sed.html

