From bda44e93541fa478abf3ce4b3461f026a90fa8cb Mon Sep 17 00:00:00 2001
From: Thomas Voss
Date: Mon, 11 Sep 2023 05:15:20 +0200
Subject: Move the site from HTML to GSP

---
 src/prj/mmv/index.html | 667 -------------------------------------------------
 1 file changed, 667 deletions(-)
 delete mode 100644 src/prj/mmv/index.html

(limited to 'src/prj/mmv/index.html')

diff --git a/src/prj/mmv/index.html b/src/prj/mmv/index.html
deleted file mode 100644
index 09aadb1..0000000
--- a/src/prj/mmv/index.html
+++ /dev/null
@@ -1,667 +0,0 @@
- - - - m4_include(head.html) - - -
-
-

Moving Files the Right Way

- m4_include(nav.html) -
- -
-
-

I think the OpenBSD crowd is a bunch of masturbating - monkeys, in that they make such a big deal about - concentrating on security to the point where they pretty much - admit that nothing else matters to them.

-
-
- Linus Torvalds -
-
-
- -
-

- - You can find the mmv git repository over at - sourcehut - or GitHub. - -

- -

- NOTE: As of the - v1.2.0 release - there is now also the mcp utility. It behaves the same as - the mmv utility but it copies files instead of moving them. - It also doesn’t support the ‘-n’ flag as it doesn’t need to - deal with backups. -

- -

Table of Contents

- - - -

Prologue

-

- File moving and renaming is one of the most common tasks we - undertake on the command-line. We basically always do this with - the mv utility, and it gets the job done most of the - time. Want to rename one file? Use mv! Want to - move a bunch of files into a directory? Use mv! - How could mv ever go wrong? Well I’m glad you asked! -

- -

Advanced Moving and Pitfalls

-

- Let’s start off nice and simple. You just inherited a C project - that uses the sacrilegious - camelCase - naming convention for its files: -

- -
-
m4_fmt_code(ls-files.sh.html)
-
- -

- This deeply upsets you, as it upsets me. So you decide you want - to switch all these files to use - snake_case, - like a normal person. Well how would you do this? You use - mv! This is what you might end up doing: -

- -
-
m4_fmt_code(manual-mv.sh.html)
-
- -

- Well… it works I guess, but it’s a pretty shitty way of renaming - these files. Luckily we only had 5, but what if this was a much - larger project with many more files to rename? Things would get - tedious. So instead we can use a pipeline for - this: -

- -
-
m4_fmt_code(camel-to-snake-naïve.sh.html)
-
- - - -

- That works and it gets the job done, but it’s not really ideal is - it? There are a couple of issues with this. -

- -
  1. You’re writing more complicated code. This has the obvious drawback of potentially being more error-prone, but also risks taking more time to write than you’d like, as you might have forgotten whether xargs actually has an ‘-L’ option or not (which would require reading the xargs(1) manual).

  2. If you try to rename the file foo to bar but bar already exists, you end up overwriting a file you may not have wanted to lose.

  3. In a similar vein to the previous point, you need to be very careful about schemes like renaming the file a to b and b to c. You run the risk of turning a into c and losing the file b entirely.

  4. Moving symbolic links is its own whole can of worms. If a symlink points to a relative location then you need to make sure you keep pointing to the right place. If the symlink is absolute however then you can leave it untouched. But what if the symlink points to a file that you’re moving as part of your batch move operation? Now you need to handle that too.
- -
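The third pitfall is easy to reproduce in a scratch directory. The following is a minimal sketch, assuming a POSIX shell and mktemp; the function name demo_chain_loss and the file contents are made up for illustration:

```shell
# Renames a -> b and then b -> c with plain mv, in that order.
demo_chain_loss() {
    dir=$(mktemp -d) || return 1
    printf 'A\n' >"$dir/a"
    printf 'B\n' >"$dir/b"
    mv "$dir/a" "$dir/b"    # silently overwrites the original b
    mv "$dir/b" "$dir/c"    # moves what used to be a
    cat "$dir/c"            # contents that ended up in c
    ls "$dir"               # files that are left
    rm -rf "$dir"
}
```

Calling demo_chain_loss shows that c now holds the contents of a, and that it is the only file left: the original b is gone for good.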

Name Mapping with mmv

- -

What is mmv? It’s the solution to all your problems, that’s what it is! mmv takes as its argument(s) a utility and that utility’s arguments, and uses them to create a mapping between old and new filenames, similar to the map() function found in many programming languages. I think the best way to convey how the tool functions is with an example. Let’s try to do the same thing we did previously, where we turned camelCase filenames into snake_case, but using mmv:

- -
-
m4_fmt_code(camel-to-snake-smart.sh.html)
-
- -

Let me break down how this works.

- -

- mmv starts by reading a series of filenames - separated by newlines from the standard input. Yes, sometimes - filenames have newlines in them and yes there is a way to handle - them but I shall get to that later. The filenames that - mmv reads from the standard input will be referred - to as the input files. Once all the input files have - been read, the utility specified by the arguments is spawned; in - this case that would be sed with the argument - 's/[A-Z]/\L_&/g'. The input files are then piped - into sed the exact same way that they would have - been if we ran the above commands without mmv, and - the output of sed then forms what will be referred - to as the output files. Once a complete list of output - files is accumulated, each input file gets renamed to its - corresponding output file. -
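The flow described above can be modelled in a few lines of shell. This is only a rough sketch of the idea, not how mmv is actually implemented: mini_mmv is a made-up name, it has none of mmv’s safety features, and it assumes filenames without newlines:

```shell
# usage: ls | mini_mmv CMD [ARG...]
mini_mmv() {
    in_list=$(mktemp) || return 1
    out_list=$(mktemp) || { rm -f "$in_list"; return 1; }
    cat >"$in_list"                    # 1. read the input files from stdin
    "$@" <"$in_list" >"$out_list"      # 2. run the mapping command on them
    # 3. refuse to touch anything if the line counts differ
    if [ "$(wc -l <"$in_list")" -ne "$(wc -l <"$out_list")" ]; then
        echo 'mini_mmv: input and output counts differ' >&2
        rm -f "$in_list" "$out_list"
        return 1
    fi
    # 4. rename each input file to the output file on the same line
    while IFS= read -r src <&3 && IFS= read -r dst <&4; do
        mv -- "$src" "$dst"
    done 3<"$in_list" 4<"$out_list"
    rm -f "$in_list" "$out_list"
}
```

The important design point is that nothing is renamed until the mapping command has finished and the complete list of output files is known.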

- -

Let’s look at a simpler example. Say we want to rename 2 files in the current directory to use lowercase letters. We could use the following command:

- -
-
m4_fmt_code(mmv-tr.sh.html)
-
- -

- In the above example mmv reads 2 lines from - standard input, those being LICENSE - and README. Those are our 2 input files now. - The tr utility is then spawned and the input files - are piped into it. We can simulate this in the shell: -

- -
-
m4_fmt_code(tr.sh.html)
-
- -

- As you can see above, tr has produced 2 lines of - output; these are our 2 output files. Since we now have our 2 - input files and 2 output files, mmv can go ahead - and rename the files. In this case it will rename - LICENSE to license and - README to readme. For some examples, check - the examples section of this page down - below. -
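Because the mapping command is an ordinary filter, a mapping can be previewed before any file is touched. A small sketch, assuming paste and mktemp are available; preview_mapping is a hypothetical helper and not part of mmv:

```shell
# usage: printf '%s\n' NAME... | preview_mapping CMD [ARG...]
# Prints each input name and its mapped name on one line, tab-separated.
preview_mapping() {
    names=$(mktemp) || return 1
    cat >"$names"
    "$@" <"$names" | paste "$names" -
    rm -f "$names"
}
```

Running ls | preview_mapping tr A-Z a-z in the directory from the example would list LICENSE next to license and README next to readme.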

- -

Filenames with Embedded Newlines

- -

- People are retarded, and as a result we have filenames with - newlines in them. All it would have taken to solve this issue - for everyone was for literally anybody during - the early UNIX days to go “hey, this is a bad idea!”, - but alas, we must deal with this. Newlines are of course not - the only special characters filenames can contain, but they are - the single most infuriating to deal with; the UNIX utilities all - being line-oriented really doesn’t work well with these files. -

- -

- So how does mmv deal with special characters, and - newlines in particular? Well it does so by providing the user - with the -0 and -e flags: -

- -
-
-0
-
-

- Tell mmv to expect its input to not be - separated by newlines (‘\n’), but by NUL - bytes (‘\0’). NUL bytes are the only - characters not allowed in filenames besides forward - slashes, so they are an obvious choice for an - alternative separator. -

-
-
-e
-
-

- Encode newlines in filenames before passing them to the - provided utility. Newline characters are replaced by the - literal string ‘\n’ and backslashes by the - literal string ‘\\’. After processing, the - resulting output is decoded again. -

-

If combined with the -0 flag, input is still read using the NUL-byte as the input separator, but the encoded input files are written to the spawned process newline-separated.

-
-
- -
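The encoding that -e performs can be approximated with sed. This sketch only illustrates the substitution rules given above; encode_name is a made-up helper, and mmv does not necessarily do it this way:

```shell
# Print the filename $1 with backslashes doubled and each embedded
# newline replaced by the two characters '\' and 'n'.
encode_name() {
    printf '%s\n' "$1" |
        sed -e 's/\\/\\\\/g' |
        sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' -e 's/\n/\\n/g'
}
```

Backslashes are doubled first so that a filename which already contains a literal backslash followed by the letter n cannot be mistaken for an encoded newline when the output is decoded again.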

The Simple Case

- -

In order to better understand these flags and how they work, let’s go through another example. We have 2 files (one with and one without an embedded newline) and our goal is simply to reverse these filenames. In this example I am going to display newlines in filenames with the “$'\n'” syntax, as this is how my shell displays embedded newlines.

- -

- We can start by just trying to naïvely pass these 2 files - to mmv and use rev to reverse the - names, but this doesn’t work: -

- -
-
m4_fmt_code(mmv-rev.sh.html)
-
- -

This doesn’t work because, due to the line-oriented nature of ls and rev, we are actually trying to rename the files foo, bar, and baz to the new filenames oof, rab, and zab. As can be seen in the following diagram, the embedded newline makes our input ambiguous and mmv can’t reliably proceed anymore:

- -
- -
- - - -
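The ambiguity is easy to see by counting names. In this sketch the two filenames are the hypothetical foo$'\n'bar and baz, and the helper names are made up:

```shell
# Count names the way a line-oriented tool sees them.
count_by_newline() { wc -l | tr -d ' '; }
# Count names by NUL terminators instead.
count_by_nul() { tr -cd '\000' | wc -c | tr -d ' '; }

# Two names, newline-terminated: the embedded newline adds a third "name".
printf 'foo\nbar\nbaz\n' | count_by_newline    # prints 3
# The same two names, NUL-terminated.
printf 'foo\nbar\0baz\0' | count_by_nul        # prints 2
```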

- The first thing we need to do in order to proceed is to pass - the -0 flag to mmv. This will - tell mmv that we want to use the NUL-byte as our - input separator and not the newline. We also need ls - to actually provide us with the filenames delimited by NUL-bytes. - Luckily GNU ls gives us the - --zero flag to do just that: -

- -
-
m4_fmt_code(mmv-rev-zero.sh.html)
-
- -

So we’re getting places, but we aren’t quite there yet. The issue now is that mmv received 2 input files from the standard input, but rev produced 3 output files. Why is that? Well, let’s try our hand at a little bit of command-line debugging with sed:

- -
-
m4_fmt_code(sed-debugging.sh.html)
-
- -

- If you aren’t quite sure what the above is doing, here’s a quick - summary: -

- - - -

In the sed output, we can see that $ represents the end of a line, and \000 represents the NUL-byte. All looks good here: we have two inputs separated by NUL-bytes. Now let’s try to throw in rev:

- -
-
m4_fmt_code(sed-debugging-rev.sh.html)
-
- -

Well wouldn’t you know it? Since rev also works with newline-separated input, it reversed our NUL-byte separators and now gives us 3 outputs. Luckily the folks over at util-linux provided us with the -0 flag here too, so that we can properly handle NUL-delimited input. Combining all of this together we get a final working product:

- -
-
m4_fmt_code(reverse-embedded-newline.sh.html)
-
- -
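The debugging technique used in this section also works without any real files. A sketch, assuming GNU sed (whose l command prints non-printable bytes as octal escapes) and the hypothetical filenames foo$'\n'bar and baz; show_bytes is a made-up name:

```shell
# Make line ends and NUL-bytes visible.
show_bytes() { sed -n l; }

# Emit the two names NUL-terminated, as ls --zero would.
printf 'foo\nbar\0baz\0' | show_bytes
```

With GNU sed this should print foo$ on one line and bar\000baz\000$ on the next, which makes it obvious where the separators ended up.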

Encoding Newlines

- -

Sometimes we want to rename a bunch of files, but the command we want to use doesn’t support NUL-bytes as nicely as we would like. In these cases, you may want to use the -e flag to encode your newline characters into the literal string ‘\n’, so that your input can be passed newline-separated to your given command.

- -

- For a real-world example, perhaps you want to edit some - filenames in vim, or whatever other editor you use. Well we can - do this incredibly easily with the vipe utility - from - the moreutils - collection. The vipe command simply reads input - from the standard input, opens it up in your editor, and then - prints the resulting output to the standard output; perfect - for mmv! We do not really want to deal with - NUL-bytes in our text-editor though, so let’s just encode our - newlines: -

- -
-
m4_fmt_code(vipe.sh.html)
-
- - - -

- When running the above code example, you will see the following - in your editor: -

- -
-
m4_fmt_code(vim.html)
-
- -

After you exit your editor, mmv will decode all occurrences of ‘\n’ back into a newline, and all occurrences of ‘\\’ back into a backslash:

- -
- -
- -

Individual Execution

-

The previous examples are great and all, but what do you do if your mapping command doesn’t have the concept of an input separator at all? This is where the -i flag comes into play. With the -i flag we can get mmv to execute our mapping command for every input filename. This means that as long as we can work with a complete buffer, we don’t need to worry about separators.

- -

To be honest, I cannot really think of any situation where you might actually need to do this. If you can think of one, please email me and I’ll update the example on this page. Regardless, let’s imagine that we wanted to rename some files so that each filename is replaced with the SHA-1 hash of that filename. On Linux we have the sha1sum program which reads input from the standard input and outputs the SHA-1 hash. This is how we would use it with mmv:

- -
-
m4_fmt_code(sha1sum-long-example.sh.html)
-
- -

- Another approach is to invoke mmv twice: -

- -
-
m4_fmt_code(sha1sum-short-example.sh.html)
-
- -

- If you are confused about why we need to make a call - to awk, it’s because the sha1sum - program outputs 2 columns of data. The first column is our hash - and the second column is the filename where the to-be-hashed - data was read from. We don’t want the second column. -

- -

- Unlike in previous examples where one process was spawned to map - all our filenames, with the -i flag we are spawning - a new instance for each filename. If you struggle to visualize - this, perhaps the following diagrams help: -

- -
-
Invoking mmv without -i
- -
- -
-
Invoking mmv with -i
- -
- -
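In shell terms, the difference is roughly the following. This is a sketch of the idea only: map_once, map_each and hash_name are hypothetical helpers, names are read newline-separated for brevity, and handing each name to the command without a trailing newline is an assumption made here:

```shell
# Without -i: one process maps every name.
map_once() { "$@"; }

# With -i: a fresh process is spawned for each name.
map_each() {
    while IFS= read -r name; do
        printf '%s' "$name" | "$@"
    done
}

# Mapping command: SHA-1 of whatever arrives on stdin, hash column only.
hash_name() { sha1sum | awk '{ print $1 }'; }
```

Feeding two names to map_once hash_name yields a single hash of the whole input, while map_each hash_name yields one hash per name, which is what the renaming needs.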

Safety

-

When compared to the standard for f in *; do mv $f …; done or ls | … | xargs -L2 mv constructs, mmv is significantly safer to use. These are some of the safety features that are built into the tool:

- -
  1. If the number of input and output files differs, execution is aborted before making any changes.
  2. If an input file is renamed to the name of another input file, the second input file is not lost (i.e. you can rename a to b and b to a with no problem).
  3. All input files must be unique and all output files must be unique. Otherwise execution is aborted before making any changes.
  4. In the case that something goes wrong during execution (perhaps you tried to move a file to a non-existent directory, or a syscall failed), a backup of your input files is saved automatically by mmv for recovery.
- -
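One way to get the second guarantee is to rename in two phases: first move every input file out of the way, then move each one to its destination. Whether mmv works exactly like this is an assumption; the sketch below, with the made-up name swap_safe, only shows why a and b can trade places without loss:

```shell
# usage: swap_safe A B   (renames A to B and B to A)
swap_safe() {
    stage=$(mktemp -d ./stage.XXXXXX) || return 1
    # Phase 1: park both files under temporary names.
    mv -- "$1" "$stage/0" && mv -- "$2" "$stage/1" || return 1
    # Phase 2: move each parked file to its final name.
    mv -- "$stage/0" "$2" && mv -- "$stage/1" "$1" || return 1
    rmdir "$stage"
}
```

The price of this scheme is that a failure between the two phases leaves the files parked under their temporary names rather than their original ones.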

Due to the way mmv handles #2, when things do go wrong you may find that all of your input files have disappeared. Don’t worry though, mmv takes a backup of your files before doing anything. If you run mmv with the -v option for verbose output, you’ll notice it backing up your stuff in the $XDG_CACHE_DIR directory:

- -
-
m4_fmt_code(mmv-verbose.sh.html)
-
- -

- Upon successful execution - the $XDG_CACHE_DIR/mmv/TIMESTAMP directory will be - automatically removed, but it remains when things go wrong so - that you can recover any missing data. The names of the - backup-subdirectories in the $XDG_CACHE_DIR/mmv - directory are timestamps of when the directories were created. - This should make it easier for you to figure out which directory - you need to recover if you happen to have multiple of these. -

- -

Examples

- - - -

Swap the files foo and bar:

-
-
m4_fmt_code(examples/swap.sh.html)
-
- -

- Rename all files in the current directory to use hyphens (‘-’) - instead of spaces: -

-
-
m4_fmt_code(examples/hyphens.sh.html)
-
- -

- Rename a given list of movies to use lowercase letters and - hyphens instead of uppercase letters and spaces, and number them - so that they’re properly ordered in globs (e.g. rename The - Return of the King.mp4 to - 02-the-return-of-the-king.mp4): -

-
-
m4_fmt_code(examples/number.sh.html)
-
- -
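A mapping command for this kind of renaming could look roughly like the following awk filter. It is a sketch and not necessarily the command used in the example: number_names is a made-up name, and starting the numbering at 00 is an assumption based on the 02 prefix mentioned above:

```shell
# Reads names on stdin; prints numbered, lowercased, hyphenated names.
number_names() {
    awk '{
        name = tolower($0)        # lowercase the whole name
        gsub(/ /, "-", name)      # spaces become hyphens
        printf "%02d-%s\n", NR - 1, name
    }'
}
```

Since the number depends on a name’s position in the list, a mapping like this needs to see all input files in one process.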

Rename files interactively in your editor while encoding newlines into the literal string ‘\n’, making use of vipe(1) from moreutils:

-
-
m4_fmt_code(examples/vipe.sh.html)
-
- -

Rename all C source code and header files in a git repository to use snake_case instead of camelCase using the GNU sed(1) ‘\L’ extension:

-
-
m4_fmt_code(examples/camel-to-snake.sh.html)
-
- -

- Lowercase all filenames within a directory hierarchy which may - contain newline characters: -

-
-
m4_fmt_code(examples/lowercase.sh.html)
-
- -

Map filenames which may contain newlines in the current directory with the command ‘cmd’, which itself does not support NUL-byte-separated entries. This only works assuming your mapping doesn’t require any context outside of the given input filename (for example, you would not be able to number your files, as this requires knowledge of the input file’s position in the input list):

-
-
m4_fmt_code(examples/i-flag.sh.html)
-
-
- -
- - - - -- cgit v1.2.3