html lang="en" {
	head { m4_include(head.gsp) }
	body {
		header {
			div {
				h1 {-m4_abbr(POSIX) Pitfalls}
				m4_include(nav.gsp)
			}

			figure .quote {
				blockquote {
					p {=
						Plan 9 argues that given a few carefully implemented abstractions it
						is possible to produce a small operating system that provides
						support for the largest systems on a variety of architectures and
						networks.
					}
				}
				figcaption {=
					@cite{-The Use of Name Spaces in Plan 9} by Rob Pike et al.
				}
			}
		}

		main {
			h2 #prologue {-Prologue}
			p {-
				Since the moment I decided to take software development more seriously,
				I have been absolutely enamored by the Shell @x-ref{-1} — the
				m4_abbr(POSIX) shell to be more specific.  The syntax is questionable
				at times, and the available resources outside of the m4_abbr(POSIX)
				specification itself are absolutely piss-poor as a result of the
				average *NIX user failing to understand the difference between
				@code{-/bin/sh} and Bash @x-ref{-2}.  What @em{-really} drew me into the
				Shell was the powerful idea of composability, and being able to combine
				simple tools to form a much more powerful one in only a handful of
				lines.  I talked more about this
				@a href="/blog/extend" {-in my previous post}.
			}

			p {-
				It didn’t take long for me to find issues with my beloved
				@code{-/bin/sh} however.  Like it or not, the modern shells we all use
				such as Bash and Zsh are all based on a design that is approaching half a
				century in age.  It got some things right — like the idea that you can use
				loops and conditional statements in a pipeline — but it also got a lot
				of things wrong, and these are things that we can improve on.  The most
				obvious deficiency in m4_abbr(POSIX) shells is the absolutely abhorrent
				handling of whitespace.
			}

			p {-
				There have been quite a few alternatives to the m4_abbr(POSIX) shell
				made over the years, although I find this to be an area that is
				shockingly underdeveloped.  If you’re reading this, I implore you to
				attempt to design your own shell, no matter how simple.  If you know
				how to make one, you can experiment with new ideas!  If you don’t, it’s
				a really great learning experience, even if all your shell can do is
				spawn a process.
			}

			aside {
				p data-ref="1" {-
					My first ever ‘programming’ language that I learnt was actually
					Windows Batch Script back on my elementary school laptops.
				}
				p data-ref="2" {-
					If you see someone using Oh-My-Zsh unironically, you can rest assured
					they know absolutely nothing about how their shell works.
				}
			}

			h2 #alternatives {-Alternatives to m4_abbr(POSIX)}
			p {-
				There are a few alternative shells that have managed to garner a
				respectable userbase.  Fish, PowerShell, Nushell, and Elvish just to
				name the ones I can think of off the top of my head have all managed to
				get a userbase while giving the finger to m4_abbr(POSIX).  I do believe
				that ditching m4_abbr(POSIX) is a necessity to create a half-decent
				modern shell.  I used Fish for close to a year and it is
				probably my favorite of the bunch; it tries to do its own thing with
				its own ideas, but it still remains highly familiar for those coming
				from m4_abbr(POSIX).
			}

			p {-
				I’m not entirely happy with Fish though.  Fish and most of the other
				modern shells all fall, in my opinion, into the classic trap of
				over-engineering; they try to do too much and lose sight of what the
				shell is fundamentally all about.  The philosophy of the shell is to
				manipulate streams by composing small and simple tools, yet Fish
				bundles in a whole host of builtins that add nothing while replacing
				functionality that is already solved by existing tools.  You can read
				from @code{-/dev/urandom} to generate random numbers, yet Fish added a
				@code{-random} builtin.  You can do arbitrary-precision mathematics with
				the Bc and Dc calculators, yet Fish added the @code{-math} builtin.  The
				same goes for the @code{-string} builtin.
			}
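
			p {-
				As a rough sketch of the @code{-/dev/urandom} approach, the snippet
				below reads two random bytes with @code{-od} and reduces them to a
				small range.  The function names are made up for illustration, and
				the modulo introduces a slight bias that is fine for a sketch; the
				arithmetic could equally be piped to Bc.
			}

```shell
#!/bin/sh
# rand_u16: print one unsigned 16-bit integer read from /dev/urandom.
# (Hypothetical helper name; /dev/urandom is assumed to exist.)
rand_u16() {
	od -An -N2 -tu2 /dev/urandom | tr -d ' \n'
}

# rand_below N: print a number in the range [0, N).
# The modulo is slightly biased, which is acceptable here.
rand_below() {
	echo $(( $(rand_u16) % $1 ))
}

# Roll a six-sided die, counting from zero.
rand_below 6
```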

			p {-
				I do appreciate Fish though, because despite losing sight of what a
				shell should be (in my opinion), they still tried something new, and I
				respect that.  The same goes for all the other shells out there.  Also
				they definitely do get some things right.  Using Fish as an example once
				again, they decided to just remove the ‘?’ wildcard from globs entirely
				— a move I completely support.
			}

			p {-
				All in all, while I don’t think any of these ‘mainstream’ alternatives
				got it right, they are a great source of inspiration for me as to what I
				should or should not do should I make my own shell.
			}

			h2 #andy {-Introducing Andy}
			p {-
				Andy is a shell that I’ve been meaning to make for around 2 years now
				which never materialized as a result of a lack of dedicated focus, and a
				lack of a thought-out vision and design.  Part of why I’m writing this
				in fact is to help me develop a proper vision for what I want Andy to
				be; I find that discussing and writing about things helps a lot with
				this kind of thing.
			}

			p {-
				I want the philosophy of Andy to reflect that of the original Bourne
				Shell, and the fewer features the better — ‘less is more’ as Ludwig Mies
				van der Rohe famously said.  That being said, not all features should be
				thrown to the wayside; if a feature is simple to understand, simple to
				implement, and solves a real problem, there is no problem in adding it.
			}

			p {-
				Take process substitution for example.  To properly compare the outputs
				of two processes in m4_abbr(POSIX) shell, we need to do this whole
				rigmarole:
			}

			figure {
				pre {= m4_fmt_code(proc-diff.sh.gsp) }
			}

			p {-
				Now compare that to the Bash solution using process substitution:
			}

			figure {
				pre {= m4_fmt_code(proc-diff.bash.gsp) }
			}

			p {-
				The Bash solution is more readable, and far easier to understand at a
				glance.  It’s also a lot better functionally in that it doesn’t require
				you to manually clean up your temporary file (something
				which might fail if your script receives certain signals).  It’s more
				efficient too; instead of waiting for @code{-cmd2} to write all its
				output to a temporary file for us to read, both @code{-cmd1} and
				@code{-cmd2} are run in parallel to each other.  This can obviously be
				solved using named pipes, but now we’re adding more complexity to our
				application.
			}
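
			p {-
				For comparison, here is a minimal sketch of the named-pipe variant.
				The two functions are hypothetical stand-ins for @code{-cmd1} and
				@code{-cmd2}, and @code{-mktemp -d} is assumed to be available (it
				is widespread, but older m4_abbr(POSIX) editions do not require it).
			}

```shell
#!/bin/sh
# Hypothetical stand-ins for the two commands being compared.
cmd1() { printf 'a\nb\nc\n'; }
cmd2() { printf 'a\nB\nc\n'; }

# Create a private directory to hold the named pipe, and remove it
# again when the script exits.
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
mkfifo "$dir/fifo" || exit 1

# cmd2 writes into the pipe in the background while diff reads
# cmd1's output from standard input, so both run concurrently.
cmd2 >"$dir/fifo" &
out=$(cmd1 | diff - "$dir/fifo")
wait

printf '%s\n' "$out"
```

			p {-
				Both commands now run in parallel, but the script has to create the
				pipe, background a job, and clean up after itself, which is exactly
				the extra complexity mentioned above.
			}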

			p {-
				There are a few fundamental ‘problems’ I want to fix in Andy.  The
				first is whitespace handling; safe m4_abbr(POSIX) shell scripts will
				contain almost as many quotation marks to avoid word-splitting as Lisp
				programs contain parentheses.  This is an absolute must: under no
				circumstances should strings be expanding into even more strings without
				the explicit consent of the user; it’s a recipe for disaster and it’s
				the shell-equivalent of the null-pointer-exception.
			}
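
			p {-
				To make the problem concrete, the short sketch below passes the same
				variable to a function with and without quotes and counts the
				arguments that arrive.  The helper @code{-count_args} and the
				filename are invented for the example.
			}

```shell
#!/bin/sh
# count_args: print how many arguments it received (hypothetical helper).
count_args() { echo "$#"; }

file='my holiday photo.jpeg'

unquoted=$(count_args $file)    # split on spaces into separate words
quoted=$(count_args "$file")    # kept as a single word

echo "unquoted: $unquoted, quoted: $quoted"
# prints "unquoted: 3, quoted: 1"
```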

			p {-
				The second major fix I want to make is in terms of datatypes.  For this
				I took major inspiration from Plan 9’s Rc shell.  While the fundamental
				datatype of the shell is the @em{-stream} — which is well-represented by
				the string — we very often are working with @em{-lists} of items.  Lists
				of filenames, lists of regular expression matches, etc.  I want lists to
				be a first-class citizen of Andy.
			}
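
			p {-
				For context, the closest thing m4_abbr(POSIX) shell has to a list is
				the positional parameters, of which there is only one set per
				function or script.  A small sketch with made-up filenames:
			}

```shell
#!/bin/sh
# POSIX sh has one list-like structure: the positional parameters.
set -- 'first file.txt' 'second.txt' 'third one.txt'

count=$#
# "$@" expands to each element as its own word, spaces intact;
# printf reuses its format string once per argument.
joined=$(printf '[%s]' "$@")

echo "$count items: $joined"
# prints "3 items: [first file.txt][second.txt][third one.txt]"
```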

			p {-
				Outside of these major changes, there are other minor changes I want to
				make.  I want to use a C-style syntax similar to (but even simpler than)
				that of Rc.  The whole ‘if-then’ and ‘esac’ business is both overly
				verbose for a language that needs to work well in a m4_abbr(REPL), and
				just plain ugly.  A friend of mine even suggested that the reason the
				Bourne Shell decided to call them ‘case-statements’ instead of
				‘switch-statements’ like every other language was that nobody would
				remember how to spell ‘hctiws’.
			}

			p {-
				I also want to allow functions to take named arguments, and to
				completely remove the need for newline-escaping, allowing for readable
				multiline pipelines.
			}
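
			p {-
				As an illustration of the newline-escaping in question, this is what
				a multiline pipeline looks like in m4_abbr(POSIX) shell when each
				pipe is placed at the start of a line.  The input data is made up.
			}

```shell
#!/bin/sh
# Every line break before a pipe has to be escaped with a backslash,
# otherwise the shell treats the line as a complete command.
result=$(printf 'pear\napple\npear\n' \
	| sort \
	| uniq \
	| tr '\n' ' ')

echo "$result"
```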

			p {-
				In ‘@a
					href="https://blog.plover.com/Unix/whitespace.html"
				{=
					The shell and its crappy handling of whitespace
				}’, the author Mark Dominus offers an example piece of shell script to
				rename @code{-*.jpeg} files to @code{-*.jpg}.  Take note of all the
				quoting that is required in his example in order to properly handle
				filenames with spaces, as well as the seemingly useless ‘do’ keyword:
			}

			figure {
				pre .sh {= m4_fmt_code(suf.bash.gsp) }
			}

			p {-
				Here is how I envision such a solution in Andy:
			}

			figure {
				pre .sh {= m4_fmt_code(suf.an.gsp) }
			}

			p {-
				Notice the complete lack of quotes in the Andy solution, because it
				lacks the retardation of automatic word-expansion.  The syntax is also
				minimal, fast to type, and visually out of the way.  C-style braces work
				well here; they’re only one character each.  We can also completely
				remove the ‘do’ keyword, and potentially even make the binding of an
				iteration variable optional — I’m not sure about that yet though.
			}

			p {-
				I’m currently in the process of actively developing Andy, and I will
				probably make another post on here soon detailing the current progress
				and features of the shell.  I hope to soon be able to use Andy as my
				primary shell, both for scripting and interactive use.
			}
		}

		hr{}

		footer { m4_footer }
	}
}