notes-computer-jasper-jasperSyntaxNotes1


"Even Herb Sutter, of C++ fame agrees:

> One of the things Go does that I would love C++ to do is a complete left-to-right declaration syntax. That is a good thing because the left-to-right makes you end up in a place where you have no ambiguities, you can read your code in a more straightforward way, which also makes it more toolable.

(6:00) http://channel9.msdn.com/Shows/Going+Deep/Herb-Sutter-C-Ques... "

--

http://en.wikipedia.org/wiki/Most_vexing_parse

C++ variable declaration syntax is (unless you know all the detailed rules) ambiguous: since you can declare and initialize on the same line, it sometimes gets confusing, e.g.

  TimeKeeper time_keeper(Timer());

is that declaring an instance named time_keeper of class TimeKeeper and calling its constructor, or is it "a function declaration for a function time_keeper which returns an object of type TimeKeeper and takes a single (unnamed) argument which is a function returning type Timer (and taking no input)."? (C++ picks the second reading, the function declaration.)

--

could maybe do const/let and #ifdef/if distinction using a 'when to evaluate' sigil?

--

ambiguous precedence can be ignored if all relevant ops are marked associative with each other

--

" This points out that designing a syntax after a keyboard is a tricky business. Some national layouts make it clear that they are made with no consideration of programming needs whatsoever. I definitely think there should be several layouts tuned to particular needs (but similar enough to still be somewhat usable by different users). Programmers use the national characters (such as åäö) less often than operators, so why are we forced to work on keyboards that have dedicated keys for (åäö) but squeeze up to 3 important operator-characters on other keys? For your information, here are some examples (plain, shift, altgr):

    2 " @
    7 / {
    8 ( [
    9 ) ]
    0 = }
    + ? \
    < > |
    ¨ ^ ~

¨ ^ ~ - all of them are normally "dead", meaning they get attached to other characters, so you need to type them twice and then delete one to get a single one. "
-this sucks particularly if you work with a shell

"

    Here are the only special characters I can get with single key press without going to numpad ,.-'+§<
    It sucks to code with Finnish keyboard layout. Swedes probably use same. 3 extra vowels make it hard. Hmm.
    Hmmm. This discussion makes me think that should I configure full custom keyboard layout for coding.
    If getting rid of that shift is such a great advantage in programming."

"

    As a curiosity:
    The only characters that are available unshifted on (practically) all keyboards are letters a-z, digits 0-9 punctuation ,. and addition/subtraction +-
    For all other characters, there is a very large percentage of keyboards where some form of shifting is necessary.
    All the paragraphs that mention () being vastly superior to [] are therefore factually incorrect.
    So yeah, that's silly :-)"

--

since punctuation is hard to search for using std tools, perhaps require punctuation infix operator assignments to be imported individually rather than with an import *

--

mb prohibit defining symbolic operators without first defining an alphanumeric function to bind to them

--

consider scala's syntax for anonymous functions:

l.map( x => x*2 )

note that i think the same syntax is reused for function types, e.g.

(fun: List[T] => T)

--

i think golang has an LALR(1) grammar?

--

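LL(k) \subset LR(k) \subset GLR, and LALR(1) \subset LR(1). (LL(1) is not quite contained in LALR(1): a few LL(1) grammars are not LALR(1), though most practical ones are.)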

--

apparently if you have an LL(k) grammar you can parse it in linear time without backtracking using a recursive descent parser (a recursive descent parser that never backtracks is also called a 'predictive parser'), which is intuitive and easy to write. this sounds like a good way to make it easy for others to write macros by hand (otoh by that time they get the AST, right, so it doesn't really matter?). ANTLR also does LL(*).
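
a minimal sketch of what such a predictive parser looks like, for a made-up two-rule LL(1) grammar (the grammar, `tokenize`, `Parser` and `parse` are all hypothetical names for illustration, nothing to do with Jasper's actual grammar). each grammar rule becomes one method, and one token of lookahead is always enough to pick the branch, so there is no backtracking:

```python
import re


def tokenize(text):
    # Integers and the three punctuation tokens; anything else is silently skipped.
    return re.findall(r"\d+|[()+]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_expr(self):
        # expr := term ('+' term)*
        node = self.parse_term()
        while self.peek() == "+":
            self.advance()
            node = ("+", node, self.parse_term())
        return node

    def parse_term(self):
        # term := NUMBER | '(' expr ')'
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok.isdigit():
            return int(tok)
        if tok == "(":
            node = self.parse_expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at token {parser.peek()!r}")
    return tree


print(parse("1 + (2 + 3)"))
```

every token is consumed exactly once, which is where the linear running time comes from.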

so i guess we should make Jasper LL(k) if possible, or LALR(1) if not.

what is scheme? i bet it's LL(*). not sure tho.

golang also makes a big deal about being parsable without a symbol table. i guess that's important for efficiency.

--

http://programmers.stackexchange.com/questions/19541/what-are-the-main-advantages-and-disadvantages-of-ll-and-lr-parsing

--

an argument for block syntax that i don't quite understand:

" munificent 10/21/10 Hi, I'm the author of that post.

On Oct 21, 6:13 am, Corey Thomasson <cthom.li...@gmail.com> wrote:
> block arguments for example, are just another form of closures, which go has

I'm aware of that. Wasn't that clear from the article? It was under the "syntax" section and specifically says that the parser will desugar it to a regular anonymous function. The goal here isn't to change semantics, it's to add some syntactic support so that scoped behavior doesn't look so awkward. I think a little syntax can go a long way towards encouraging something to be idiomatic.

--

"

C# pulled it off. There's nothing magical at all about operator overloading, with the exception of the assignment operation, which

--

bwk complains that Pascal has too few precedence levels ( http://www.lysator.liu.se/c/bwk-on-pascal.html ): " while (i <= XMAX) and (x[i] > 0) do ...

...

By the way, the parentheses in this code are mandatory - the language has only four levels of operator precedence, with relationals at the bottom. ... There is a paucity of operators (probably related to the paucity of precedence levels).

"

so maybe we need more than 4 levels but less than 10

list of a bunch of languages and their precedence:

http://rosettacode.org/wiki/Operator_precedence

Go has 6 levels, mb that's good

But i would say it has at least 7 levels:

Go's levels are:

    Precedence    Operator
    6             all unary operators
    5             *  /  %  <<  >>  &  &^
    4             +  -  |  ^
    3             ==  !=  <  <=  >  >=
    2             &&
    1             ||
    0             things that in Go form statements, not expressions, e.g. ++

in Go, % is remainder, << and >> and & and &^ are bitwise stuff, | and ^ are bitwise OR stuff, && is AND and || is OR. So the levels can be described as:

    6             unary
    5             multiplicative and most bitwise
    4             additive and OR-ish bitwise
    3             comparison
    2             AND
    1             OR
    0             things that in Go form statements, not expressions, e.g. ++
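
to see how few levels it takes to drive a parser, here is a rough precedence-climbing sketch in Python that uses a subset of the Go levels listed above. `BINARY_OPS`, `TOKEN` and `evaluate` are made-up names, only integer literals are supported, `/ % << >> &^` are left out, and every binary operator is assumed left-associative:

```python
import operator
import re

# operator -> (precedence level, implementation); higher levels bind tighter.
BINARY_OPS = {
    "*": (5, operator.mul), "&": (5, operator.and_),
    "+": (4, operator.add), "-": (4, operator.sub),
    "|": (4, operator.or_), "^": (4, operator.xor),
    "==": (3, operator.eq), "<": (3, operator.lt), ">": (3, operator.gt),
    "&&": (2, lambda a, b: bool(a) and bool(b)),
    "||": (1, lambda a, b: bool(a) or bool(b)),
}

TOKEN = re.compile(r"\d+|==|&&|\|\||[-+*&|^<>()]")


def evaluate(text):
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_primary():
        tok = advance()
        if tok == "(":
            value = parse_expr(1)
            advance()  # consume the closing ")"
            return value
        if tok == "-":  # unary minus binds tighter than any binary operator
            return -parse_primary()
        return int(tok)

    def parse_expr(min_prec):
        left = parse_primary()
        while peek() in BINARY_OPS and BINARY_OPS[peek()][0] >= min_prec:
            prec, fn = BINARY_OPS[advance()]
            right = parse_expr(prec + 1)  # prec + 1 gives left associativity
            left = fn(left, right)
        return left

    return parse_expr(1)


print(evaluate("4 + 2 & 1"))    # & is on the multiplicative level here: 4 + (2 & 1)
print(evaluate("(4 + 2) & 1"))
```

note that Python itself puts & below +, so the first expression would come out differently if typed straight into a Python prompt; the table is what decides.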

note: things with more levels are not always that bad, e.g. java is very orderly, and also needs more levels because it treats more things as 'operators':

http://introcs.cs.princeton.edu/java/11precedence/

some of the extra things that java deals with as operators are:

unary precedence: [] access array element . access object member () invoke a method

unary precedence, but above the former, also, right to left (everything else mentioned here is left to right unless it says otherwise): () cast new object creation

comparison precedence: instanceof type comparison

lowest precedence (right-to-left): assignment

" Precedence order gone awry. Sometimes the precedence order defined in a language do not conform with mathematical norms. For example, in Microsoft Excel, -a^b is interpreted as (-a)^b instead of -(a^b). So -1^2 is equal to 1 instead of -1, which is the values most mathematicians would expect. Microsoft acknowledges this quirk as a "design choice". One wonders whether the programmer was relying on the C precedence order in which unary operators have higher precedence than binary operators. This rule agrees with mathematical conventions for all C operators, but fails with the addition of the exponentiation operator. Once the order was established in Microsoft Excel 2.0, it could not easily be changed without breaking backward compatibility. "

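for comparison, Python is one language where exponentiation binds tighter than a unary minus on its left, so it follows the mathematical convention from the quote. a tiny check (the variable names are just for illustration):

```python
math_style = -1 ** 2       # parsed as -(1 ** 2)
excel_style = (-1) ** 2    # the grouping the quote attributes to Excel

print(math_style)   # -1
print(excel_style)  # 1
```

so if Jasper gets an exponentiation operator, it probably wants its own level above unary minus.
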
D has 15 levels, and C++ 17, and C 15 woah:

http://stackoverflow.com/questions/2669153/d-operator-precedence-levels-version-1-0

http://en.cppreference.com/w/cpp/language/operator_precedence

http://web.ics.purdue.edu/~cs240/misc/operators.html

--

function vs operator precedence in haskell: basically functions bind tighter, but note that they cannot bind to an operator at all (unless it is surrounded by parens):

http://stackoverflow.com/questions/3125395/haskell-operator-vs-function-precedence

    Precedence    Associativity    Operators
    9             left             !!
    9             right            .
    8             right            ^, ^^, **
    7             left             *, /, `div`, `mod`, `rem`, `quot`
    6             left             +, -
    5             right            :, ++
    4             none             ==, /=, <, <=, >, >=, `elem`, `notElem`
    3             right            &&
    2             right            ||
    1             left             >>, >>=
    0             right            $, $!, `seq`

-- http://www.haskell.org/onlinereport/decls.html#prelude-fixities

--

toread:

http://kevincantu.org/code/operators.html

http://echo.rsmw.net/n00bfaq.html

http://blog.psibi.in/2013/02/operator-precedence-and-associativity.html

https://www.fpcomplete.com/blog/2012/09/ten-things-you-should-know-about-haskell-syntax

--

arguments to annotations should work like arguments to functions, with defaults overridden by keyword args coming after positional args:

e.g. java can't do this but scala and .NET can:

" we can not mix-and-match the two styles in Java:

    @SourceURL(value = "http://coders.com/",
    mail = "support@coders.com")
    public class MyClass extends HisClass ...

Scala provides more flexibility in this respect

    @SourceURL("http://coders.com/",
    mail = "support@coders.com")
    class MyScalaClass ...

This extended syntax is consistent with .NET’s annotations and can accommodate their full capabilities "

-- http://docs.scala-lang.org/tutorials/tour/annotations.html
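
Python decorators already behave this way, since a decorator call is an ordinary function call. a small sketch (the `source_url` decorator and both classes are hypothetical, modelled on the Scala example above):

```python
def source_url(value, mail=None):
    """Hypothetical annotation: one positional argument plus an optional keyword one."""
    def decorate(cls):
        # Attach the annotation arguments to the class so they can be inspected later.
        cls.source_url = {"value": value, "mail": mail}
        return cls
    return decorate


# positional argument first, then a keyword argument overriding a default
@source_url("http://coders.com/", mail="support@coders.com")
class MyClass:
    pass


# all-keyword style also works, and the default for mail is kept
@source_url(value="http://coders.com/")
class OtherClass:
    pass


print(MyClass.source_url)
print(OtherClass.source_url)
```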

---

hoon's ?: for 'if' and ?- for 'case' look good

---

in clojure, syntax-quoting:

(defmacro with-open-2 [[r resource] & forms]
  `(let [~r ~resource]
     (try
       ~@forms
       (finally (.close ~r)))))

(instead of:

(defmacro with-open-1 [[r resource] & forms]
  (list 'let ;; clojure symbol -- an atom of code!
        [r resource]
        (concat (list 'try)
                forms
                (list (list 'finally (list '.close r))))))

)

---

nimrod's use of : for blocks is interesting; i guess that's significant indentation though?

---

the syntax of this is interesting to me, particularly the () surrounding a block whose output is desired, and the use of 'as' instead of = to bind a name to the value computed by that block

"
select dashboards.name, log_counts.ct
from dashboards
join (
  select dashboard_id, count(distinct user_id) as ct
  from time_on_site_logs
  group by dashboard_id
) as log_counts
on log_counts.dashboard_id = dashboards.id
order by log_counts.ct desc
"

---

idea from Hoon:

In Hoon, the +-<> axis limb syntax for nested chains of head/tail is interesting, if perhaps too complicated as syntax ( http://urbit.org/doc/hoon/tut/2/ ). But perhaps we can generalize this anyways?

specify composition of head() and tail() by alternation of single characters, e.g.

++-- might mean "head head tail tail", reading right to left (Hoon works left to right and has a more complicated scheme but i prefer this)

generalize this: allow user to specify custom compositions of arbitrary things (functions? or more general?) with single letters
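
a quick Python sketch of the generalized version, under my own reading of the idea: a table maps single characters to one-argument functions, and a string of those characters is composed right to left (rightmost applied first). `head`, `tail`, `LETTERS` and `compose` are all made-up names, and pairs are modelled as 2-tuples:

```python
def head(pair):
    return pair[0]


def tail(pair):
    return pair[1]


# Assumed letter table: '+' means head and '-' means tail, as in the note.
LETTERS = {"+": head, "-": tail}


def compose(spec, letters=LETTERS):
    """Build one function from a string of single-character names.

    The string is read right to left, so the rightmost character runs first.
    """
    steps = [letters[ch] for ch in reversed(spec)]

    def run(value):
        for step in steps:
            value = step(value)
        return value

    return run


tree = (0, (1, ((2, 3), 4)))
print(compose("++--")(tree))  # head(head(tail(tail(tree))))

# The same machinery with a user-supplied table of arbitrary functions.
text_ops = {"s": str.strip, "u": str.upper}
print(compose("us", text_ops)("  hi "))
```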

--

Want to say stuff like

A_(x+1, y) = 2A_(x,y)

--

wambotron 3 hours ago

Why use

public string $x = ;

instead of

public $x:string = ;

It seems inconsistent to me, probably because I've used AS3/Haxe.

---

having an indentation error when you cut and paste a single line into an ipython console really gets me:

In [1458]: xx, yy = meshgrid(range(shape(areas)[0]), range(shape(areas)[1]))
IndentationError: unexpected indent (<ipython-input-1458-f399ae675b48>, line 1)

If you want to paste code into IPython, try the %paste and %cpaste magic functions.

--

" Python indexing is done using brackets, so you can see the difference between an indexing operation and a function call. " -- http://lorenabarba.com/blog/why-i-push-for-python/

--

there is something to Hoon's idea of uniformly using prefixes to indicate different types of literals (as opposed to e.g. Python's "3." to indicate "3.0", a floating point literal; but Python's may be easier to learn)

--

there is something to Hoon's idea of providing URL-safe literal syntaxes

--

As a rule of thumb, if grouping punctuation characters can ever appear without spaces separating them from neighboring constructs, then one should never have grouping punctuation (such as '(') double as part of an operator (e.g. '~()').

--

To take a stab at the second question, I think BNF that fits on one page https://docs.python.org/3.4/reference/grammar.html vs http://perldoc.perl.org/perlfaq7.html#Can-I-get-a-BNF%2fyacc...

This basically means Perl is very complex and its grammar can be self contradicting, such that behavior is undefined. C++ has a similar problem to a lesser extent.

riffraff 1 hour ago

To expand on the non-syntax, perl has an incredible amount of language-level features, which may appear very weird to people who have only seen it from afar.

For example, perl formats[0] are language-level support for generating formatted text reports and charts, which is basically a whole sublanguage (much like perl regexen).

[0] http://perldoc.perl.org/perlform.html

Demiurge 1 hour ago

That's pretty crazy. I used Perl a lot, but haven't seen that feature :)

dragonwriter 29 minutes ago

> To take a stab at the second question, I think BNF that fits on one page

Maybe for python, but not for Ruby. Ruby is not particularly simple to parse (though it may be simpler to parse than Perl, and clearly seems to be simpler to implement -- or perhaps its just that more motivation exists to implement it.)

Demiurge 11 minutes ago

First google result: http://www.cse.buffalo.edu/~regan/cse305/RubyBNF.pdf

I think 2 pages is not bad :) The point is, Perl is just impossible to formally define, it depends on the implementation to make arbitrary choices. This means multiple implementations are much harder, if possible.

dragonwriter 3 minutes ago

> First google result: http://www.cse.buffalo.edu/~regan/cse305/RubyBNF.pdf

Yeah, but its not:

1) One page, or

2) Current (it claims to be for Ruby v1.4), or

3) (apparently, I can't verify this for the version of Ruby it claims to represent) Accurate [1]

[1] http://stackoverflow.com/questions/663027/ruby-grammar

But, yes, Ruby can be parsed independent of being executed, which means you can separate the work of a separate implementation into (1) building (or reusing) a parser, and (2) building a system to execute the result of the parsing. Being able to divide the work (and, as a result, to share the first part between different implementations) makes it easier to implement.


stcredzero 1 hour ago

When I looked at such things last, Python had about 29 terminals & nonterminals in its grammar. Ruby had 110. (These are numbers I remember from playing with particular parser libraries, so YMMV.) By contrast, a commercial Smalltalk with some syntax extensions had 8. I have no idea about Perl, but I'd guess it's about the same as Ruby.

--

pornel 16 hours ago

I hope ideas will flow the other way too, and Rust adopts some sugar from Swift.

I find `if let concrete = optional` sooo much nicer than `match optional { (concrete) => , _ => {} }`.

Rust has solid semantics that covers more than Swift. OTOH Apple has put their UX magic into the language's syntax. Some syntax shortcuts, like `.ShortEnumWhenInContext` are delightful.

The two combined will be the perfect language ;)

andolanra 10 hours ago

You could always write this yourself with a macro:

    macro_rules! if_let {
      ($p:pat = $init:expr in $e:expr) => {
        { match $init { $p => $e,_  => {}, } }
      }
    }
    fn main() {
      let tup = (2, 3);
      if_let!{(2, x) = tup in println!("x={}", x)}; // prints x=3
      if_let!{(5, x) = tup in println!("x={}", x)}; // doesn't print
    }

It's slightly more heavyweight, but still not too bad.

--

xixixao 15 hours ago

Funny how this just falls out of JS (in CS):

  if concrete = optional
    call concrete

stormbrew 15 hours ago

Pretty much every language to date allows this construct. And it's often considered a bad idea in languages where = is assignment because it's so easy to get it confused with == and accidentally assign to the thing you're trying to compare to, which will usually evaluate to truthy.

The special things about the way it works in Swift are:

In JS, C, CS, Ruby, etc. you're not really doing anything useful if you assign a value to another name just for one branch of an if statement. In Swift you are.
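
for what it's worth, Python (3.8+) ended up with a middle road: binding inside a condition is allowed, but only with a distinct `:=` operator, so it cannot be confused with `==`. a small sketch (the `settings` dict and `describe` function are made up for illustration):

```python
settings = {"editor": "vim"}


def describe(key):
    # Bind the lookup result to a name and test it in a single expression.
    if (value := settings.get(key)) is not None:
        return f"{key} is set to {value}"
    return f"{key} is not set"


print(describe("editor"))
print(describe("shell"))
```

unlike Swift's `if let`, the name stays bound after the if, so this is only a partial analogue.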

---

Io has a weird syntax where you can put some parameters to a function to the left of it, separated by a space, and others in parens to its right. I find it to be very readable for some things. Presumably it's because these are not functions, but object methods.

Io> s findSeq("is")

> 2

Io> s findSeq("test")

> 10

Io> s slice(10)

> "test"

Io> s slice(2, 10)

> "is is a "

---

i still like the idea of . reversing the ordering.

but if we are doing things in the usual ordering, e.g. f | g x for (f(g(x))), then f.g.x as interpreted by typical languages isn't switching the ordering.

so let's have it be f | g x and x.g.f

or, alternately, use . instead of | (that is, use . as Haskell's $), since | is hard to type on android keyboards, except that it still binds very tightly:

f (g (x)) = f.g.x = f.(g x) = f(g(x)) = (in other languages, f[g[x]] )

  in contrast to

f g x, which is like (f(g))(x) or (in other languages, f(g,x))

  but wouldn't it be more convenient to have a looser binding like Haskell's $? e.g. for f . g x to be f(g x) instead of (f(g))(x)? not sure. 

isn't this exactly what haskell does with . anyhow?

--

using the Python style all defaulted params must be named, but the caller can choose to give any argument by name

i guess that's okay

--

mb force named arguments if more than a few parameters (e.g. only permit 3 positional params).
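
Python can already express this cutoff per function with a bare `*` in the parameter list; everything after it must be passed by name. a sketch with a made-up `make_window` function that allows exactly 3 positional params:

```python
def make_window(title, width, height, *, resizable=True, visible=True):
    # Parameters after the bare * are keyword-only.
    return {"title": title, "width": width, "height": height,
            "resizable": resizable, "visible": visible}


# The caller may still choose to name a positional parameter (height here).
window = make_window("notes", 640, height=480, visible=False)
print(window)

try:
    make_window("notes", 640, 480, False)  # a 4th positional argument is rejected
except TypeError as error:
    rejected = str(error)
    print(rejected)
```

the difference from the idea above is that in Python the function author opts in, whereas the note proposes having the language enforce the limit everywhere.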

--

hoon has a good idea: urlsafe atom syntax

--

serial/parallel sigils?

--

it's desirable to be able to write chains of processing steps from left to right, like Ruby, instead of nested, like Lisp:

[1,2,3].map {|n| n*n }.reject {|n| n%3==1 }
  is better than:

(remove-if (lambda (n) (= (mod n 3) 1)) (mapcar (lambda (n) (* n n)) '(1 2 3)))

but i'd prefer not to have everything OOP like in Ruby, because it seems silly to me that a symmetric two-argument function like addition should be defined in an asymmetric way.

so how could we do that if map and reject were just functions?

you'd just have to have a syntax operator that lets you say "take the result on my left, and give it as the first argument to the function on my right". to generalize, let the user choose which argument on the right gets the thing on the left.

perhaps this is how arrows work in Haskell, i'm not sure.

so e.g., using '|' as the operator and '-' to mark where to put the argument, you'd have something like:

[1,2,3] | map - {|n| n*n } | reject - {|n| n%3==1 }

note the similarity to Unix shell syntax. Why is this longer than the above Ruby code? because we're explicitly specifying at which arguments to put the incoming results. We could say that if no place is specified (by the next pipe), then put it as the first argument:

[1,2,3] | map {|n| n*n } | reject {|n| n%3==1 }

now, what about Ruby's 'yield'? we don't need 'yield' if we are just passing anonymous functions, we only need it if the block coming in can 'return' in the larger scope. And, to make things as concise as possible, we may as well omit the argument lists in the anonymous lambdas and use special default variables to match positional args:

[1,2,3] | map {$1*$1 } | reject {$1%3==1 }

imo that's even easier to read than Ruby!

interesting that this scheme uses two kinds of default variables: the target of the pipe (set by '-', or, by default, the first argument of the function), and the variables for the anonymous lambdas ($1, $2 etc)

note: instead of $1,$2, etc, should we use x,y,z or a,b,c?
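
to check that the scheme hangs together, here is a rough emulation in Python. everything in it is hypothetical scaffolding: `Pipe` wraps a value so that `|` feeds it to the next step, `HOLE` plays the role of '-', and `into` builds a step that puts the piped value at the hole, or in the first argument position if no hole is given. `map_list` and `reject` are plain functions, not methods:

```python
class Pipe:
    """Wraps a value so that `|` can pass it on to the next step."""

    def __init__(self, value):
        self.value = value

    def __or__(self, step):
        return Pipe(step(self.value))


HOLE = object()  # marks where the piped value goes, like '-' in the note


def into(fn, *args):
    """Return a step that calls fn with the piped value in place of HOLE.

    Without a HOLE, the piped value becomes the first argument.
    """
    def step(piped):
        if any(arg is HOLE for arg in args):
            return fn(*[piped if arg is HOLE else arg for arg in args])
        return fn(piped, *args)

    return step


def map_list(items, fn):
    return [fn(item) for item in items]


def reject(items, predicate):
    return [item for item in items if not predicate(item)]


result = (Pipe([1, 2, 3])
          | into(map_list, lambda n: n * n)              # default: first argument
          | into(reject, HOLE, lambda n: n % 3 == 1)     # explicit placement
          ).value
print(result)
```

the lambdas still spell out their parameter, so this covers the pipe target only, not the $1-style default variables.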

--

mb use $, the aliasable variable sigil, for global vars too; ruby uses $ for globals, i think

--

