6 Linux csplit command examples for beginners

When working at the command line in Linux, you may find yourself in a situation where you need to split a file into several parts. If you’re already looking for a way to do this, or just want to know how it can be done, you’ll be happy to know that there is a tool, called csplit, designed just for this purpose.

In this article, we will discuss the basics of this tool and learn how to use it. But before we do that, it’s worth noting that all commands and instructions mentioned here have been tested on Ubuntu 16.04 LTS.

Linux csplit command

This is how the man page defines the csplit command:

csplit - split a file into sections determined by context lines

Below is its general syntax:

csplit [OPTION]... FILE PATTERN...

The pieces produced by the csplit command are written to separate files with names like xx00 and xx01.

The following Q&A-style examples should give you a good idea of how the csplit command works.

Q1. How do I split files based on the number of lines?

Suppose your file is 6 lines long and you need to split it at the third line. You can do this by passing “3” as a command line argument after the file name.

For example, in our case, file1 contains the following lines:

1 Asia
2 Africa
3 Europe
4 North America
5 South America
6 Australia

And here’s the command we ran:

$ csplit file1 3
27
66

The numbers printed in the output are the sizes, in bytes, of the files the command produced. Two files were created, named xx00 and xx01: xx00 holds the lines before line 3, and xx01 holds everything from line 3 onward.

The contents of these files confirm that the split took place at line 3.
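To check the result yourself, the session can be reproduced in a scratch directory. This is a minimal sketch that assumes file1 contains exactly the six lines listed above (the variable name workdir is only for illustration); the byte counts csplit prints depend on the exact file content, so they may differ from the numbers shown in the article.

```shell
# Work in a scratch directory so the xx* files do not clutter anything else.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the six-line sample file (assumed content, one entry per line).
printf '%s\n' '1 Asia' '2 Africa' '3 Europe' \
    '4 North America' '5 South America' '6 Australia' > file1

# Split before line 3; csplit prints the size in bytes of each piece.
csplit file1 3

# Print both pieces, each under a header with its file name.
# xx00 holds lines 1-2, xx01 holds lines 3-6.
head xx00 xx01
```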

Q2. How do I split files using regular expressions?

Besides line numbers, csplit accepts regular-expression patterns, written as /REGEXP/, and any pattern can be followed by a repeat count in braces. For example, in the previous case, if you want the tool to repeat the split one more time, you can do it with the following command:

csplit file1 3 {1}

So, in this case, three output files were produced:

$ cat xx00
1 Asia
2 Africa
$ cat xx01
3 Europe
4 North America
5 South America
$ cat xx02
6 Australia
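The same three pieces can also be produced with regular-expression patterns, which take the form /REGEXP/ and split before the first line that matches. The sketch below assumes the same six-line file1; the patterns Europe and Australia are chosen only because each matches exactly one line of that sample.

```shell
# Work in a scratch directory and recreate the sample file (assumed content).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1 Asia' '2 Africa' '3 Europe' \
    '4 North America' '5 South America' '6 Australia' > file1

# Split before the line matching "Europe", then before the line
# matching "Australia". Quoting keeps the shell from touching the slashes.
csplit file1 '/Europe/' '/Australia/'

# xx00: lines 1-2, xx01: lines 3-5, xx02: line 6.
head xx00 xx01 xx02
```

Splitting on content instead of line numbers is useful when the position of the boundary lines is not known in advance.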

Q3. How do I use my own prefix instead of the default “xx”?

By default, files produced by the csplit command are prefixed with xx. However, if you want, you can change the prefix using the -f command line option, which requires the new prefix as its argument.

For example, the following command will create files prefixed with “htf”.

csplit file1 1 -f htf

$ csplit file1 1 -f htf
0
93
$ ls htf*
htf00 htf01

Q4. How do I make csplit not delete output files on error?

By default, the csplit command removes any output files it has created as soon as it encounters an error. For example, the following run shows that no output files were left behind:

$ csplit file1 1 2 {3}
13
28
36
csplit: '2': line number out of range on repetition 3
16
$ ls xx*
ls: cannot access 'xx*': No such file or directory

However, if you want, you can change this behavior with the -k option. For example, the same command was run again with this option, and the output files were not deleted this time.

$ csplit -k file1 1 2 {3}
13
28
36
csplit: '2': line number out of range on repetition 3
16
$ ls xx*
xx00 xx01 xx02 xx03
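The effect of -k is easy to reproduce with a repeat count that is too large for the input. The following sketch assumes the six-line sample file1 from Q1 and asks csplit to split every 2 lines five more times, which runs past the end of the file; the exact pattern and repeat count are illustrative and differ from the command shown above.

```shell
# Work in a scratch directory and recreate the sample file (assumed content).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1 Asia' '2 Africa' '3 Europe' \
    '4 North America' '5 South America' '6 Australia' > file1

# Without -k: csplit fails partway through and removes the pieces it made.
csplit file1 2 '{5}' || echo "csplit failed with status $?"
ls xx* 2>/dev/null || echo "no pieces left behind"

# With -k: the same failure occurs, but the pieces written so far are kept.
csplit -k file1 2 '{5}' || echo "csplit failed with status $?"
ls xx*
```

Note that csplit still exits with a non-zero status when -k is used, so scripts that rely on the kept files should handle the failure explicitly.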

Q5. How do I suppress lines that match the input pattern?

The csplit command also provides the ability to suppress lines that match an input pattern. The option in question is --suppress-matched.

For example, the following command splits the file (file1) at line 2 (xx00 will contain line 1, while xx01 will contain the rest of the lines).

csplit file1 2

But if you want to suppress line 2, then you can run the following command:

csplit --suppress-matched file1 2
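To see the difference side by side, both variants can be run on the same input with different prefixes. This is a sketch assuming the six-line sample file1 from Q1 and a GNU csplit recent enough to support --suppress-matched; the prefixes plain and quiet are made up for the illustration.

```shell
# Work in a scratch directory and recreate the sample file (assumed content).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1 Asia' '2 Africa' '3 Europe' \
    '4 North America' '5 South America' '6 Australia' > file1

# Ordinary split at line 2: the second piece starts with line 2.
csplit -f plain file1 2

# Same split, but the boundary line (line 2) is dropped from the output.
csplit -f quiet --suppress-matched file1 2

# Compare the first line of the second piece in each case.
head -n 1 plain01 quiet01
```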

Q6. How do I use a different number of digits instead of the default 2?

Just like the prefix, the number of digits that follow the prefix in the output file names is configurable. So if you want names like xx000 and xx001, you can use the -n command line option, which requires a number indicating the new number of digits.

For example:

csplit -n 1 file1 2

The above command will output filenames like xx0, xx1, and so on.
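The -n option combines naturally with -f from Q3. The sketch below, again assuming the six-line sample file1, asks for three digits and a custom prefix; the prefix part is only an example name.

```shell
# Work in a scratch directory and recreate the sample file (assumed content).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1 Asia' '2 Africa' '3 Europe' \
    '4 North America' '5 South America' '6 Australia' > file1

# Use the prefix "part" and three-digit suffixes, splitting at line 3.
csplit -f part -n 3 file1 3

# The pieces are named part000 and part001.
ls part*
```

A wider suffix is mainly helpful when a split may produce more than 100 pieces, since the default two digits only cover xx00 through xx99.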

Conclusion

The Linux csplit command is not one that most users reach for often, but it is a useful utility and worth knowing. We’ve covered most of the basic examples and command line options here. Try them out, and then head to the tool's man page to learn more about it.
