ideas-computer-jasper-jasperNotes

---

handy list of symbols convenient for freq usage:

unshifted, double unshifted, shifted, double shifted

`-=()\;',./ `` -- == (( ))
;; ,, .. ~!@#$%^&*[]_+{}

:"<>? ~~ !! @@ ## $$ %% ^^ && [[?]] __ ++ [[image: ?]]:: "" 1 ??

i am wondering which of these are hard to type on non-US keyboards. the second post here http://www.cpptalk.net/5-vt10808.html?postdays=0&postorder=asc&start=60 opines that avoiding such characters from the start would have been better: "I don't know. Had the problem been addressed from the start, if for example, Kernighan and Richie had refused to use any character which wasn't in the invariant part of ISO 646, I think it would have been a good thing. I've had to develop C on terminal which only supported ISO 646-DE." A quoted comment on that page also gave some examples of common characters which are hard to type in italy: "I've to admit that it's difficult to find PCs in italy with an US keyboard; looks like italians are not considered as potential programmers (it's hard to type "{") or internet citizens, for that matter (it's hard to type "@" or "~" too, with no standard for it)."

So maybe i should look at ISO 646? according to http://en.wikipedia.org/wiki/ISO/IEC_646 , there is the invariant subset, but there is also T.61, which gives you more punctuation, but leaves out { and ~, which the italian guy found hard (T.61 does have @, and i've gotta believe that @ at least will be changing in italy soon; that post was from 2004 btw). the punctuation still not in T.61 is: \ ^ ` {} ~

the ones in T.61 but not INV are #$@[]

C deals with this with http://en.wikipedia.org/wiki/C_Trigraph

http://stackoverflow.com/questions/1234582/purpose-of-trigraph-sequences-in-c :

"It may happen that some terminals and/or virtualization doesn't let you access easily to some characters. In my experience the main offender is the tilde. – Francesco Nov 3 at 19:24"

see also http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2910.pdf , although it mostly talks about backwards-compatibility and doesn't give much info useful for someone designing a new language

http://www.wikicreole.org/wiki/Talk.EscapeCharacterProposal says that tilde is difficult on italian and german keyboards

i searched some more but didn't find much else. i guess i'll assume that mainly ~ is the problem. mb curly braces, too.

backslash isn't very common so that just prevents me from treating it like an easily-typed unshifted character.

as for the ones in T.61, the only one of those that i expect to be real common is []. but i cant very well leave out both [] and {}.

Exploring Regularity in Source Code: Software Science and Zipf's Law Hongyu Zhang

lists the most common tokens and identifiers in some java-related situations:

                                          Table 4. Top twelve most common tokens
           Rank    1     2    3      4    5         6    7          8       9         10       11         12
     Jena          ()    .    ;      ,    {}        =    public     new     return    if       +          String
     Tomcat        ()    .    ;      {}   =         ,    public     if      String    null     +          return
     Ant           ()    .     ;     {}   =         ,    public     String  if        new      +          void
     Swing         ()    ;     .     ,    {}        =    if         int     public    return   null       0
     jEdit         ()    ;     .     ,    =         {}   if         int     return    public   new        i
     Jetty         ()    ;     .    =     {}        ,    public     if      String    null     return     import
     jHotdraw      ()    .     ;    {}    ,         =    public     void    int       return   new        if
     DrJava        ()    .     ;    ,     {}        =    public     new     void      String   return     +
     Protégé       ()    ;     .     ,    {}        =    public     return  slot      void     private    String
     Cocoon        .     ()    ;     {}   ,         =    this       String  import    if       org        null
     JavaCC        ()    .     ;     ostr println   =    ,          {}      +         if       i          []
     jUnit         ()    ;     .    ,     {}        =    public     new     void      return   String     0
                                          Table 5. Top ten most common identifiers
     Rank          1        2             3                   4         5        6          7        8               9          10
     Jena          String   i             jena                om        hp       hp1        n        m               node       resource
     Tomcat        String   i             org                 apache    name     log        java     javax           request    append
     Ant           String   org           apache              tools     ant      i          File     BuildException  java       project
     Swing         i        g             c                   x         y        String     e        java            a          width
     jEdit         i        String        jEdit               name      buffer   length     log      Object          e          path
     Jetty         String   i             log                 java      org      e          name     IOException     length     mortbay
     jHotdraw      x        y             draw                r         CH       ifa        point    Figure          java       i
     DrJava        String   assertEquals  doc                 File      cs       edu        rice     i               e          drjava
     Protégé       slot     String        cls                 Slot      i        frame      Cls      Collection      edu        Stanford
     Cocoon        String   org           apache              cocoon    i        getLogger  java     name            avalon     framework
     JavaCC        ostr     println       i                   0         i        j          String   java            Vector     Options
     jUnit         String   e             GridBagConstraints  test      Test     i          junit    expected        result     message

and from "CSteg: Talking in C code"

              Table 1: Frequency of C tokens in cryptographic software.
                   Token type                   Appearance in %
                   Punctuator                               51.59
                   Identifier                               30.02
                   Numerical literal                        11.63
                   Reserved word                             4.77
                   String literal                            1.29
                   Preprocessor directive                    0.7

Measures have been made with tools taken from (?). Comments have not been counted. Frequency distribution of C tokens gathered in our tests is described in Table 1.

              Table 2: Freq. of punctuator tokens in analyzed software.
                   Token      Frequency     Token      Frequency
                   ,              21.52     ->              2.05
                   ;              13.21     .               1.82
                   (              12        *               1.73
                   )              12        (?)             1.34
                   =               5.41     #               1.19
                   ]               4.8      v++             1.11
                   [               4.8      +               1
                   {               2.21     *v              0.92
                   }               2.21     Other          11.68

Most used punctuator tokens are described in Table 2. Reserved words frequencies are more homogeneous (Table 3). We have found that inside each group of possible tokens (punctuators and reserved words) there are only a few tokens which are commonly used. The rest of the ...

              Table 3: Freq. of reserved words in cryptographic software.
                   Word       Frequency     Word       Frequency
                   if             14.84     static          2.93
                   int            13.79     register        2.84
                   unsigned        9.25     case            2.83
                   char            8.84     while           2.60
                   for             8.30     break           2.54
                   void            5.85     sizeof          1.54
                   else            5.09     extern          1.21
                   return          5.02     short           1.14
                   long            3.74     struct          0.98
                   const           3.49     Other           3.15

---

todo, read http://stackoverflow.com/questions/tagged/language-design


newtype vs data with a single strict field: newtype is just type coercion, takes no time at runtime. how to do in jasper? compiler that recognizes when constant fields are only referred to in types?
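
for comparison, python's typing.NewType behaves roughly like the newtype described above: it exists for the type checker only, and at runtime the "constructor" just returns its argument. a small sketch (UserId and lookup are made-up names, not anything from jasper):

```python
from typing import NewType

# UserId is a hypothetical name, used only for illustration.
UserId = NewType("UserId", int)


def lookup(uid: UserId) -> str:
    # A static checker can reject lookup(42); at runtime uid is a plain int.
    return f"user #{uid}"


uid = UserId(42)
# NewType adds no wrapper object: the value is still an ordinary int.
assert type(uid) is int
print(lookup(uid))
```

so one option for jasper might be the same split: the distinction lives in the type checker and is erased before runtime.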

"new" constructor to construct pattern with constants in pattern? or just constructor?

all caps are keywords (global symbols)

how to simplify things like this (Java Android): ((AlarmManager) context.getSystemService(Context.ALARM_SERVICE)).cancel(pendingIntentAlarm);

dependent types? i.e. context.getSystemService(Context.ALARM_SERVICE) returns something of type AlarmManager?
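
short of real dependent types, one way to lose the cast is to key the lookup by the type itself, so the result type follows from the argument. a sketch in python with type hints; Context, register and get_service here are hypothetical stand-ins, not the Android API:

```python
from typing import Dict, List, Type, TypeVar

T = TypeVar("T")


class AlarmManager:
    """Hypothetical stand-in for the Android class of the same name."""

    def __init__(self) -> None:
        self.cancelled: List[str] = []

    def cancel(self, intent: str) -> None:
        self.cancelled.append(intent)


class Context:
    """Hypothetical service registry keyed by class instead of by string constant."""

    def __init__(self) -> None:
        self._services: Dict[type, object] = {}

    def register(self, service: object) -> None:
        self._services[type(service)] = service

    def get_service(self, cls: Type[T]) -> T:
        # The return type is tied to the argument, so callers need no cast.
        service = self._services[cls]
        assert isinstance(service, cls)
        return service


context = Context()
context.register(AlarmManager())
context.get_service(AlarmManager).cancel("pendingIntentAlarm")
```

this only covers the case where the key can be a type; a constant like ALARM_SERVICE would still need something closer to dependent typing.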


in haskell, = is the only symbol needed to defn a fn w/ args:

f x y = exp

in jasper currently, u need : also:

f = x y : exp (or should it be f = y x : exp ?)

should we make it like in haskell, where multiple things on the lhs signifies a fn w/ inputs? or should we save that form for computed lhs's i.e. in assignments? also note that we want the fn name to be on the very left, b/c text editors left justify, as noted earlier. so

f x y = exp is a fn? or

y x f = exp

means "assign exp to the result of y x f"?

and what about graph assignment?

so far i think we should keep it as is

to avoid typing the :, we could make it so that either all of the variables have to be indented, or none of them, and that if there is a single thing at the end indented more than the vars, then there is an implicit colon in between the vars and it. so both of these are equiv to "f = x y : exp":

f = x y exp

f = x y exp

mb the general rule is stated: "after there has been at least one thing after the equals, if there is another thing indented more, then put a : in between them". of course, if there is no other thing indented more, then there is no : at all.

the following is illegal:

f = x exp y

hmm, we should probably turn that around actually, since we dont want the fn body, which is bigger, to be the thing which is indented. so

f = x y exp

is the way to go; and

f = x exp y

or

f = x exp

are both illegal (after something with a lesser indent, i.e. the body, you cannot have something of a greater indent, i.e. a variable)


do we want to replace multiplication and exponentiation with something like knuth up arrow notation or the hyperoperator? seem like it would annoy ppl. but so elegant!

also, as noted, could make "-" prefix denote inverse by convention, and have -+ for subtraction, -* for division.
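
to see what a single hyperoperator would replace, here is a sketch of the standard recursive definition in python (level 1 is addition, 2 multiplication, 3 exponentiation, 4 tetration). it is only meant for tiny arguments, since it recurses all the way down to successor:

```python
def hyper(n: int, a: int, b: int) -> int:
    """Hyperoperation H_n(a, b) for non-negative integers (naive recursion)."""
    if n == 0:
        return b + 1  # successor
    if b == 0:
        # Base cases: a + 0 = a, a * 0 = 0, and a ^ 0 (and above) = 1.
        if n == 1:
            return a
        if n == 2:
            return 0
        return 1
    # Each level iterates the level below it.
    return hyper(n - 1, a, hyper(n, a, b - 1))


print(hyper(1, 2, 3))  # addition
print(hyper(2, 3, 4))  # multiplication
print(hyper(3, 2, 5))  # exponentiation
print(hyper(4, 2, 3))  # tetration: 2 ^ (2 ^ 2)
```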


lazy or strict patterns?

strict patterns let you define case-statements in an ad-hoc way for ad-hoc polymorphism. lazy patterns seem more natural, but arent conditional

haskell "pattern bindings" are like "graph assignment". gentle intro points out the need for some laziness there:

" fib@(1:tfib) = 1 : 1 : [ a+b | (a,b) <- zip fib tfib ]

This version of fib has the (small) advantage of not using tail on the right-hand side, since it is available in "destructured" form on the left-hand side as tfib.

[This kind of equation is called a pattern binding because it is a top-level equation in which the entire left-hand side is a pattern; i.e. both fib and tfib become bound within the scope of the declaration.]

Now, using the same reasoning as earlier, we should be led to believe that this program will not generate any output. Curiously, however, it does, and the reason is simple: in Haskell, pattern bindings are assumed to have an implicit ~ in front of them, reflecting the most common behavior expected of pattern bindings, and avoiding some anomalous situations which are beyond the scope of this tutorial. Thus we see that lazy patterns play an important role in Haskell, if only implicitly."
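
a rough python analogue (not the haskell semantics, just the same shape): generators supply the laziness, so the stream can be defined in terms of itself and its own tail. `whole` and `tail` are names made up for this sketch:

```python
from itertools import islice


def fib():
    """Lazy stream defined via itself and its own tail, like fib@(1:tfib)."""
    yield 1
    yield 1
    whole, tail = fib(), fib()
    next(tail)  # drop the head, playing the role of tfib
    for a, b in zip(whole, tail):
        yield a + b


first_ten = list(islice(fib(), 10))
print(first_ten)
```

a strict destructuring such as `head, *rest = fib()` would try to consume the whole infinite stream and never finish, which is the python-side version of why the pattern binding has to be lazy.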


thought about typing syntax from an attempt to write down the abstract data structures used in a todo list program i saw:

activeness = ['active 'inactive]

projectp.actions.# is action       // actions are in projects
categoryp.projects.# is project    // projects are in categories

projectp.name is str
categoryp.name is str
projectp.status is activeness
categoryp.status is activeness

project = projectp proto
category = categoryp proto


// is single-line comment

convention: capitalized identifier means the set of things that fit the prototype assigned to the corresponding lowercase identifier

convention: when you say "x is a", this means tell the type system to prove that "x in A"

Activeness = ['active 'inactive]   // could also have said $'[active, inactive]

project.actions.# is action        // actions are in projects
category.projects.# is project     // projects are in categories

project.name :: str
category.name :: str
project.status :: activeness
category.status :: activeness


old:

convention: "is" means "isa", and is like : in haskell (mb should be ::)

project.name is str
category.name is str
project.status is activeness
category.status is activeness


want quick way to represent one-to-many relations; equiv to these type statements :

// 1-to-many relation b/t projects and actions
project.actions.# is action
action.project is project

// 1-to-many relation b/t categories and projects
category.projects.# is project
project.category is category

but also the assumption that:

&
  a in p.action if          // hmmm dont i mean "implies" instead of "if"?
    action.project == p
  action.project == p if    // hmmm dont i mean "implies" instead of "if"?
    a in p.action

hmm, rewrite:

&
  (a in p.action)- or action.project == p
  (action.project == p)- or a in p.action

heck, i guess that stuff might be common in assertions. let's have "implies" and "biconditional" logical operators, "->" and "<->" (note: see if this conflicts with whatever notation we come up with for special uses of '-'; i dont think it does b/c - is only negation at the end):

a in p.action <-> (action.project == p)
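
a sketch of what that biconditional checks, written out in python. implies/iff stand in for the proposed "->" and "<->"; Project, Action and many_to_one_holds are made-up names for illustration:

```python
def implies(p: bool, q: bool) -> bool:
    """Material implication, standing in for the proposed '->'."""
    return (not p) or q


def iff(p: bool, q: bool) -> bool:
    """Biconditional, standing in for the proposed '<->'."""
    return implies(p, q) and implies(q, p)


class Project:
    def __init__(self, name: str) -> None:
        self.name = name
        self.actions = []


class Action:
    def __init__(self, name: str) -> None:
        self.name = name
        self.project = None


def many_to_one_holds(projects, actions) -> bool:
    # a in p.actions <-> (a.project == p), for every pair (a, p)
    return all(
        iff(any(a is x for x in p.actions), a.project is p)
        for p in projects
        for a in actions
    )


p = Project("house")
a = Action("paint")
p.actions.append(a)
print(many_to_one_holds([p], [a]))  # only one direction is linked so far
a.project = p
print(many_to_one_holds([p], [a]))  # both directions are linked now
```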

anyhow, so need a notation for that, and for 1-1 and many-many. how about

many-1 project.actions action.project
1-1 wife.husband husband.wife
many-many fan.idols celebrity.fans

and also variants to indicate if the corresponding sets can be empty:

many-1 project.actions action.project      // 0 < #project.actions (action.project is a single value of type "project")
many-1' project.actions action.project     // (action.project is a single value of type "project")
many'-1' project.actions action.project    // (action.project is a single value of type "project'")
(note: #x = x len)

i guess these are shortcuts for graph pattern assertions augmented with some assertions using "in" (if those aren't allowed anyway; i guess they should be; dude so now graph patterns express FOL and set theory?!? seems too expressive):

many-1 project.actions action.project

-->

project.actions.# is action
0 < #project.actions
action.project is project
!a in !p.action <-> action.project == !p

many-1' project.actions action.project

-->

project.actions.# is action
action.project is project
!a in !p.action <-> action.project == !p

many'-1' project.actions action.project

-->

project.actions.# is action
action.project' is project'    // two conflicting uses of ' syntax -- if types are values, then is project' a value which is of type "set of projects, unioned with null", or of type "either null, or a set" -- TODO
!a in !p.action <-> action.project == !p
  // which, because action.project is used, not action.project, actually compiles to
  action.project' -> !a in !p.action <-> action.project == !p

so these many-1 guys are just macros, is that it?

syntax to add:
  postfix ' on types for nullable type
  postfix ' on values for a value of a nullable type
  # alone at end for "len"
  # prefix for what i used to call @
  <->, ->
  ! prefix for universally quantified variable
  ? prefix for variable or existentially quantified variable

--

note inspired by the above: are assertions and predicates the same? that is, can the above be interpreted as an assertion? no, something is a predicate if its the last nonindented thing in a definition, but an assertion if it's not the last. it's easy enuf to distinguish booleans being returned from boolean assertions; the last nonindented line, and the lines below it, are "real", the ones above are assertions. of course, an assertion can "call" a predicate, so the same subroutine could be used to calculate boolean return values and to calculate an assertion's bool, e.g.

greaterthan5 = x : x > 5
plus6 = x : x 6 +

or in point-free notation: greaterthan5 = > 5 plus6 = how to assert something about the result of the fn? how to indicate variables? some alternatives: x plus6 > 5 ?x plus6 > 5 won't this cause a constraint check? or should we disallow constraint satisfaction within assertions? the return value 6 +

or.. plus6 = the return value 6 + ! ?x plus6 > 5 seems redundant; from the use of plus6, we KNOW it's a postcondition

main = 8 greaterthan5 seq 8 plus6 pr

but then how to do assertions within sequences? or assertions that hold during or after the seq? hmmm mb should indicate assertions somehow afterall? ! on every assertion? or only within sequences? or "ass" or "asr" or something? i guess ! would be easiest. at the beginning of the line. optional outside seq?

so far, i think the best option is: !! at beginning of line in sequences, optional without. assertions before seqs mean conditions that hold eternally/persistently. assertions themselves can't have sequences (i.e. they cant do logging) -- but they can do nondeterminism and input i.e. they are not ref trans. ?-variables within assertions don't mean "solve this constraint", they mean, "if you should happen to apply this fn enuf times to bind these variables, then this assertion must hold" (i.e. assertions are lazy). that is, within normal program logic, ?-variables are implicitly existentially quantified, i.e. ?x f > 5 means "find an x such that f(x) is greater than 5, and set ?x to that", but in an assertion it means "for every value of x, f(x) must be greater than 5"

no wait, i was using assertion syntax to write constraints, huh. so how to write lazy assertions, i.e. "?x plus6 > 5"? or should we use syntax to disambiguate constraints and assertions? mb "?x" is existentially quantified x and "!x" is universally quantified x? so

plus6 =
  !x plus6 > 5
  6 +

means "for every x, plus6(x) must be > 5", and

plus6ButEven = x :
  !x plus6ButEven 2 mod == 0
  evener in [0 1]
  x plus6 evener +

but this could be rewritten

plus6ButEven = x :
  res 2 mod == 0
  evener in [0 1]
  res = x plus6 evener +
  res

mb we could provide "res" as a keyword so it would look like:

plus6ButEven = x :
  res 2 mod == 0
  evener in [0 1]
  x plus6 evener +

and we have gotten rid of the need for the !x. which is good, b/c on second thought the semantics of "lazy assertions" are confusing. if "3 = 2 1 f" then "f 1" returns a fn which, if given 2, yields 3. But lazy assertions let the fn defn of f make an assertion about the behavior of another function, "1 f". on third thought, there's nothing wrong with this; it lets you say

!y !x const = !x

which is perfectly reasonable
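
the "res" idea is basically a postcondition on the return value. a sketch of how that reads in python, using a made-up decorator `ensures` (not a library function); note that `evener` is computed here, whereas the jasper version leaves it as a constrained choice from [0 1]:

```python
import functools


def ensures(check):
    """Hypothetical decorator: assert check(res) on every return value."""
    def wrap(fn):
        @functools.wraps(fn)
        def checked(*args):
            res = fn(*args)
            assert check(res), f"postcondition failed for {fn.__name__}{args}"
            return res
        return checked
    return wrap


@ensures(lambda res: res > 5)
def plus6(x):
    return x + 6


@ensures(lambda res: res % 2 == 0)
def plus6_but_even(x):
    evener = (x + 6) % 2  # 0 or 1, picked so that the sum comes out even
    return plus6(x) + evener


print(plus6_but_even(1))
print(plus6_but_even(2))
```

written this way the universally quantified reading ("for every x, plus6(x) > 5") is only checked on the inputs that actually occur; plus6(-1) would trip the assertion at call time.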

btw, we dont need "let" or "where" b/c:

equations on the same indentation level in the same block are mutually recursive

equations indented below others inherit the bindings of all of their elders. i.e.

x = 3
y = 4
y > x if
  commendation = 'good'
  commendation '$$y > $$x' ++ pr

works. (how to write "if"s with many lines in the condition?) (should have a short syntactic way to concat many elements of a list, instead of putting "++" in between things to be appended? i.e. instead of "hello" ++ " " ++ "how" ++ " " ++ "are" ++ " " ++ "you?" or ["hello" " " "how" " " "are" " " "you?"]

want something like

  "hello" " " "how" " " "are" " " "you?" *+
  (using the syntactic fold convention) (if we use // for single-line comments and # for lists/ranging over numbers, then we could use @ as the syntactic fold prefix instead of *, which is good b/c we want ** to be exponentiation) (mb instead of letting users define *X, we just have them define the initial value and make it a true fold? i.e.

+.__init = 0

or mb

+ init = 0

or mb

+ identity = 0

(...but it doesnt have to be an identity; the fold op can have two arguments of diff types, e.g. "push" which takes a list and a list item, where the item doesnt have to be a list itself)

i like "+ init = 0", b/c then there could be an init fn which returns the init of ops, but i think i said something before about what things like "x f = exp" should mean, and i forgot what it was, so i wonder if this is compatible. oh, was it just graph assignment? b/c "+ init" is like "init.+"? but this "graph assignment" is just one value in the lvalue, so it's just normal __set. which is fine, b/c in this case init is just a partially-defined fn no, it's not a normal __set, b/c . wasnt used. that how __set is disambiguated from partially-defined fns.
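
a sketch of the "true fold with a per-op init" idea using python's functools.reduce. the INIT table and the names fold/push are assumptions made for illustration; the push case shows an init that is not an identity, since the accumulator and the items have different types:

```python
import operator
from functools import reduce

# Hypothetical table playing the role of "+ init = 0".
INIT = {operator.add: 0, operator.mul: 1}


def fold(op, items, init=None):
    """Left fold; falls back to the op's registered init when none is given."""
    start = INIT[op] if init is None else init
    return reduce(op, items, start)


def push(lst, item):
    """Takes a list and a list item; the item need not be a list itself."""
    return lst + [item]


print(fold(operator.add, [1, 2, 3]))
print(fold(operator.mul, [2, 3, 4]))
print(fold(push, [1, 2, 3], init=[]))
# Strings need their own init: 0 + "hello" is a type error in python.
print(fold(operator.add, ["hello", " ", "you?"], init=""))
```

the last line hints at a wrinkle: if + is overloaded across types, a single "+ init = 0" is not enough and init would have to depend on the argument type.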


btw if you have !x and ?x and you mix them in one expr, what order do the quantifiers go in, and where do they appear? mb the graph pattern matching isnt full FOL after all...

--

looking for ways to restrict from FOL.. the many-one stuff can be written in a more graph-patterny way like this:

many-1 project.actions action.project

-->

project.actions.# is action
0 < #project.actions
action.project is project
project.actions.#.project == project    // that's confusing; mb project.actions.#.project cycle
action.project.actions.?# cycle         // where ?# is existential, instead of universal, quantification

probably would want to leave out the ?# line, existentials are prob too expressive

another soln: allow some existential-ism, namely allow the assertion that a particular node is in a set. so you can't express "there exists some x in set S such that x.a > 3 and x.b > 4", because you can't just filter the set by an arbitrary condition and then assert that the result is non-empty; you can only filter the set by the condition that it contains a PARTICULAR element

so if a :: action, then 'a.project.actions.?# == a' is allowed; but 'a.project.actions.?#.xyz = 3' is not. the idea being that you can say things like "this has to be a child of that", but you can't otherwise ask the typechecker to iterate through a list and make sure that at least one element satisfies some complex condition.

note: if you allow expression of both sides of the many-1 biconditional, then when you are making a new association between an action and a project, assuming things are not atomic, then there must be a moment when the assertion is violated; i.e. either the project will point to the action first, or vice versa, before you connect them in the other direction. so need a language construct to say, "suspend eternal assertion checking during this block; that is, treat this block as atomic for those purposes". should we just let that be the same as the "atomic" construct for parallelism? i think so, but that'll be annoying for someone who wants to implement their data structures in such a way so that multiple processes coordinate to change structures during a precondition-atomic block (in fact, that will be prohibited). but heck, let's just say that such a person wont be able to use jasper's eternal assertions in that case.
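
a sketch of "suspend eternal assertion checking during this block" as a python context manager. Store, atomic, set_owner and add_action are made-up names; the store checks the many-1 invariant after every mutation unless it is inside an atomic block, and rechecks once when the block ends (nesting is not handled):

```python
from contextlib import contextmanager


class Store:
    """Hypothetical store with an eternal many-1 assertion between projects and actions."""

    def __init__(self):
        self.projects = {}  # project name -> set of action names
        self.owner = {}     # action name -> project name
        self._suspended = False

    def invariant(self):
        forward = all(
            self.owner.get(a) == p
            for p, acts in self.projects.items()
            for a in acts
        )
        backward = all(
            a in self.projects.get(p, set())
            for a, p in self.owner.items()
        )
        return forward and backward

    def check(self):
        if not self._suspended:
            assert self.invariant(), "many-1 assertion violated"

    @contextmanager
    def atomic(self):
        # Checking is off inside the block and runs once when it ends normally.
        self._suspended = True
        try:
            yield self
        finally:
            self._suspended = False
        self.check()

    def set_owner(self, action, project):
        self.owner[action] = project
        self.check()

    def add_action(self, project, action):
        self.projects.setdefault(project, set()).add(action)
        self.check()


store = Store()
with store.atomic():
    store.set_owner("paint", "house")   # invariant is briefly violated here
    store.add_action("house", "paint")  # and restored here
print(store.invariant())
```

calling set_owner outside of an atomic block on a fresh store raises the assertion, which is the "moment when the assertion is violated" described above.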


Activeness = ['active 'inactive]   // could also have said $'[active inactive]
or use keywords
Activeness = [ACTIVE INACTIVE]     // could also have said $'[active inactive]

many-1 project.actions action.project
many-1 category.projects project.category

action.name :: str
action.status :: [CANCELLED NOT_STARTED WAITING_FOR STARTED DONE]
action.priority :: [-1 0 1 2]
project.name :: str
category.name :: str
project.status :: activeness
category.status :: activeness

todo = allActions

todo [DUEDATE STATUS] sort!    // general sort, by fn, is sortWith. this is sort by keys; its given a list b/c there can be secondary sorting criteria
(action.status in [NOT_STARTED STARTED]) & (action.project' -> action.project.status == ACTIVE) & (action.category' -> action.category.status == ACTIVE)

in action.status, what is "status"? a keyword (global symbol)? a (local) symbol, taken from the same namespace that "action" was defined in? need fn to convert keywords to local symbols of the same name, so that you can do sort([KEYWORD]), and sort can find the corresponding local symbol in the package that defines the thing being sorted. that fn should also accept already local symbols, and do nothing with them

why not allow () to group within [] rather than more complex ways of switching to "code mode"?

y'know, this is all wrong if 'f.x' means f(x), b/c action.status would mean action(status), i.e. status is a variable, not a symbol. soln: 'f..x' means f(x). f.x is short for f..X. (so .. is sorta like indirect addressing in assembly)

list key sort1
list keylist sort
list cmpFn sortWith
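
the three flavors map fairly directly onto python's sorted. a sketch, with made-up function names and sample data; records are plain dicts here rather than jasper nodes:

```python
from functools import cmp_to_key
from operator import itemgetter


def sort_by_key(items, key):
    """sort1: sort by a single key."""
    return sorted(items, key=itemgetter(key))


def sort_by_keys(items, keys):
    """sort: sort by a non-empty list of keys; later keys break ties."""
    return sorted(items, key=itemgetter(*keys))


def sort_with(items, cmp_fn):
    """sortWith: general sort by a comparison function."""
    return sorted(items, key=cmp_to_key(cmp_fn))


actions = [
    {"name": "b", "duedate": 2, "status": 1},
    {"name": "a", "duedate": 1, "status": 2},
    {"name": "c", "duedate": 1, "status": 1},
]

by_due_then_status = sort_by_keys(actions, ["duedate", "status"])
by_status = sort_by_key(actions, "status")
# Comparison function: names in descending order.
by_name_desc = sort_with(
    actions, lambda x, y: (x["name"] < y["name"]) - (x["name"] > y["name"])
)
print([a["name"] for a in by_due_then_status])
```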

it should be noted that things like the period in f.x, the exclamation mark in 'list keys sort!', etc are translated to things like %___period, %___in-place-mutate, and then normal macros are given for them '(f x %___period) => f..capitalize(x)', '(* x f %___in-place-mutate) => x = * x f' mb ea. macro rule may have only one macro named in its lhs? to make it easy to determine order of application.

equations are mutually recursive EXCEPT for pseudomutation; if the same lhs occurs in two eqs, then the first one governs above the second one, and the second one governs below it. the first one governs in the rhs of the second one.


single inheritance + super + traits (traits = commutatively summed (order is irrelevant) namespaces), but for MODULES, not objects. but what about data structure definitions? i guess data structures will be accessed via API, so this is a module too. generic function by default, but modules are just namespaces and so can be easily "attached" to data structures if desired. desired: a lightweight syntax for composition-style code reuse that is as lightweight as inheritance (i.e. a lightweight syntax for wrapping). e.g.

if parent class has fn

getX

and child class wants to override, the inheritance way is

getX = return super.getX WRAPPED_STUFF

while the composition way is

getX = return this.parent.getX WRAPPED_STUFF

hmm not so different looking huh? :) the difference is that if super.getX were to refer to a method such as this.y, it would refer to child.y (because the parent object is not separate), whereas with composition it would refer to parent.y. i.e. i think the difference is just the same as the effect of "late binding of self" in prototype languages. i.e. inheritance is actually CODE reuse, b/c it's as if the code of the parent is copied into the child, whereas composition is just the wrapping of an opaque object. so i guess we need to support both?!?
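
the difference shows up as soon as the parent's method calls another method on itself. a python sketch (class and method names are made up): with inheritance the inner call is late bound to the child, with composition it stays on the wrapped parent.

```python
class Parent:
    def y(self):
        return "parent.y"

    def get_x(self):
        # self is late bound: which y runs depends on the actual receiver.
        return "x via " + self.y()


class Child(Parent):
    """Inheritance: the parent's code runs with self bound to the child."""

    def y(self):
        return "child.y"

    def get_x(self):
        return "[" + super().get_x() + "]"


class Wrapper:
    """Composition: the parent is an opaque object that is merely wrapped."""

    def __init__(self, parent):
        self.parent = parent

    def y(self):
        return "wrapper.y"

    def get_x(self):
        return "[" + self.parent.get_x() + "]"


print(Child().get_x())
print(Wrapper(Parent()).get_x())
```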

instead of super, can have parametrized modules. i.e.

mod p1 =
  getX = 1
  getY = me.getX + 10

mod p2 =
  getX = 2
  getY = me.getX + 10

mod c = parent :
  getX = 'getX is ' + parent.getX

mod c1 = p1 c
mod c2 = p2 c

c1.getX pr    // prints 'getX is 1'
c2.getX pr    // prints 'getX is 2'

umm, how are those "modules" any different from normal objects?!? mb they arent. that would be nice! so the "mod" keyword just means "within the body of the mod, i'll refer to something in as if it were in the top-level namespace, but you put in as an edge from this object, instead". i.e.

mod p1 =
  getX = 1
  getY = me.getX + 10

is short for

p1 = []
p1.getX = 1
p1.getY = p1.getX + 10

and

mod c = parent :
  getX = "getX is " + parent.getX

is short for

c = parent :
  obj = []
  obj.getX = "getX is " + parent.getX
  obj
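
the same desugaring, sketched in python: a "module" is just an object with edges, and a parametrized module is just a function that builds one. SimpleNamespace stands in for a jasper node, and the explicit str() is an assumption about how jasper would concatenate a number:

```python
from types import SimpleNamespace

# p1 = []; p1.getX = 1; p1.getY = p1.getX + 10
p1 = SimpleNamespace()
p1.getX = lambda: 1
p1.getY = lambda: p1.getX() + 10

p2 = SimpleNamespace()
p2.getX = lambda: 2
p2.getY = lambda: p2.getX() + 10


def c(parent):
    """Parametrized module: builds a fresh object that refers to parent."""
    obj = SimpleNamespace()
    obj.getX = lambda: "getX is " + str(parent.getX())
    return obj


c1 = c(p1)
c2 = c(p2)
print(c1.getX())
print(c2.getX())
print(hasattr(c1, "getY"))  # only the edges defined in c exist on c1
```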

yeah but this is neither composition nor inheritance; fns in parent which arent explicitly mentioned in c arent in c, e.g. c.getY is an error. So need stuff like:

mod c parent=p1    // inheritance
mod c wraps=p1     // composition

can we just use normal syntax for mod, pls?

c parent=p1 mod
c wraps=p1 mod

hmm since we are defining c, mb it would be appropriate to do

c = parent=p1 mod
c = wraps=p1 mod

the former would do

mod c =
  getX = 1
  getY = me.getX + 10

i.e. simply copy the code in p1, unless overridden

and the latter (wraps) would do

mod c =
  c.__super = p1
  c.getX = c.__super.getX
  c.getY = c.__super.getY

example with parameterized inheriting child

p1 = mod
  getX = 1
  getY = me.getX + 10

c = parent=? mod

would do (something that cannot be represented without multi-stage constructs b/c the code inserted into c1 depends on p1)

but we can say that

p1 = mod
  getX = 1
  getY = me.getX + 10

c = parent=? mod
  getX = "getX is " + parent.getX
c1 = p1 c

does the same as c = parent=p1 mod, and

p1 = mod
  getX = 1
  getY = me.getX + 10

c = wraps=? mod
  getX = "getX is " + parent.getX
c1 = p1 c

does the same as c = (getX = "getX is " + parent.getX) wraps=p1 mod, which CAN be represented statically w/o meta (excepting the signature property that c1 is guaranteed to have a getY now):

c = super :
  obj = []
  obj.getX = "getX is " + super.getX
  obj

so i guess mod is a meta-construct (%mod ?!?), and a requirement for the language to be static is that all of its arguments are resolved during non-runtime compilation stages (so, user input isnt required to resolve them)

oh, and if modules are just nodes, then traits are just summing of dictionaries:

c2 = c + trait1
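
a sketch of "summing dictionaries" in python, in two flavors: an overriding append where the right operand wins (so order matters), and a commutative sum that refuses clashing keys (so order cannot matter). the function names are made up, and plain dicts stand in for module nodes:

```python
def append(base, extra):
    """Non-commutative sum: entries in extra override entries in base."""
    return {**base, **extra}


def commutative_sum(a, b):
    """Commutative sum: only defined when shared keys agree."""
    clashes = [k for k in a.keys() & b.keys() if a[k] != b[k]]
    if clashes:
        raise ValueError(f"conflicting definitions: {sorted(clashes)}")
    return {**a, **b}


c = {"getX": "getX is 1"}
trait1 = {"getA": 2}
trait2 = {"getB": 3}

c2 = commutative_sum(c, trait1)
traits = commutative_sum(trait1, trait2)
print(c2)
print(append({"getX": 1}, {"getX": 2}))  # right operand wins
print(append({"getX": 2}, {"getX": 1}))  # so swapping changes the result
```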

that bit about letting the trait override the superclass, but not letting it override the class, can't work that way, tho. so i guess:

p1 = mod
  getX = 1
  getY = me.getX + 10

trait1 = mod
  getA = 2

trait2 = mod
  getB = 3

c = trait1 trait2 parent=? mod
  getX = "getX is " + parent.getX
c1 = p1 c

is how traits should be added (so mod is variadic? should it be *mod? should it be

c = traits=[trait1 trait2] parent=? mod

?)

note: "wraps" is equivalent to inheriting from a (meta-dynamically constructed) wrapper class

note: since a lot of language constructs can be implemented as macros over a smaller language, there should be a table of macros in the compiler/interpreter, and things in that table should be expanded to %___X, i.e. if the table were ['mod'], then instances of token mod would be replaced with %___mod

would be desirable to just make the fns of mod defined in terms of a few generic/common/core ops and meta-ops on nodes. eg commutative +, copy, copy code, non-commutative + (append), etc. mb ++ is commutative plus?

note: 'me' does have a special relationship to mod, but it is ALMOST a special case of parameterized modules:

x = BLAH mod == x (me : (BLAH mod1))

actually its exact?

x = BLAH mod == x = x (me : (BLAH mod1))

so mb

p1 = mod
  getX = 1
  getY = me.getX + 10

trait1 = mod
  getA = 2

trait2 = mod
  getB = 3

c = trait1 trait2 parent=? mod
  getX = "getX is " + parent.getX
c1 = p1 c

is the same as

p1 =: me ::
  getX = 1
  getY = getX + 10

trait1 =: me ::
  getA = 2

trait2 =: me ::
  getB = 3

cbody = parent me ::
  getX = "getX is " + parent.getX

c = (p1 + (trait1 ++ trait2) + (p1 cbody)) :. me

(where :: is maplambda, =: is mapbind, :. is a form of mapapply, + is append (non-commutative dictionary addition), ++ is addition (always commutative))

or, "me" could always refer to the immediately enclosing namespace, in which case we have simply

p1 =:
  getX = 1
  getY = getX + 10

trait1 =:
  getA = 2

trait2 =:
  getB = 3

cbody = parent ::
  getX = "getX is " + parent.getX

c = p1 + (trait1 ++ trait2) + (p1 cbody)

note that in this scenario, the ref to p1.getY is late bound -- it implicitly refers to the enclosing namespace, which is p1 where it is defined, but c later on. uncontrolled late binding seems undesirable, so let's reintroduce explicit me:

p1 =:
  getX = 1
  getY = .getX + 10

trait1 =:
  getA = 2

trait2 =:
  getB = 3

cbody = parent ::
  getX = "getX is " + parent.getX

c = p1 + (trait1 ++ trait2) + (p1 cbody)

now, . is short for me.getX (or should it be this.getX?!?), where me is a magic late binding variable.

but if we have any late binding how can we stay static? since p1 contains a ref to me, all uses of p1 must be bottom-up typechecked at compile time to see whether its members go anywhere else. in this case, p1 is used on the last line, so the last line must be evaluated at compile time. in other words, all higher-order references to functions containing refs to me must be eval'd at compile time (multi-stage compilation).

wait, that's dumb. if some bloke just says "x = p1; [1 2 3] x.getX map", this shouldn't have to be done at compile time (after all, you might want to map getX over some input). so mb we're going to have to make modules magic after all, or at least module operators. mb the ops with colons are all module ops, which means whatever they do to the me's is done at compile time:

c = p1 :+: (trait1 :++: trait2) :+: (p1 cbody)

heck, let's annotate regions of the source, instead:

% c = p1 + (trait1 ++ trait2) + (p1 cbody)

equivalent to:

% c = p1 + (trait1 ++ trait2) + (p1 cbody)

the convention is that regions under a % are executed at compile time, before me binding.

mb even better would be to make the % part of the =, meaning, "bind this variable, and also (re)bind 'me'" (necessitating compile-time (recursive eager?) execution, assuming this is a static language):

c %= p1 + (trait1 ++ trait2) + (p1 cbody)

this seems like it could come in handy for resolving ambiguities for objects, too, alto im not sure.

possible special self-referential words:

  me: enclosing namespace
  this: ref to current object (should this be the same as me to facilitate .shortcut?)
  self: ??
  parent: not actually special?
  super: not actually special?


johnwcowen suggests instance vars should be trait-local: http://www.artima.com/forums/flat.jsp?forum=106&thread=246488 . it's a nice idea, but i don't think i'll do that -- accessing an instance variable will already be interchangeable with accessing getters and setters, so the instance variable can be regarded as just an API when it's called

let's make implementations automatically (by default) create interfaces -- no need for an "auto" keyword.

"Now finally there is Go. Go is a new language which was released this week by Google; it was designed by old-timers of the caliber of Rob Pike and Ken Thompson, so I decided to take it very seriously and to have a look at it. It turns out that Go lacks inheritance and it has something similar to the kind of interfaces I had in mind for all this time. I do not need to write my paper about interfaces vs inheritance anymore: just look at Go documentation!" -- http://www.artima.com/weblogs/viewpost.jsp?thread=274019

scala's notation for anonymous structural types (patterns) uses {}, just like i had been thinking of doing: http://programming-scala.labs.oreilly.com/ch04.html " type Observer = { def receiveUpdate(subject: Any) } "

i might translate that to:

observer = {receiveUpdate {subject {Any} : nil} }

(this uses {} a lot tho, which is supposedly one of those hard-to-type things on foreign keyboards..)

if empty colons are nil, and if Any is assumed, then

observer = {receiveUpdate {subject :} }

(this means: observer is a structural type (a pattern on graphs). the pattern is: the root node has a child called "receiveUpdate", and that child is a function which has one input named "subject" (whose type is unconstrained), and the function returns nothing (void))
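for comparison, a small python sketch of the same structural type, using `typing.Protocol`. the class names `Logger` and `Unrelated` are made up for the example. note that python's runtime check only looks for a member called `receiveUpdate`; it does not check the input name or the return type, so it is weaker than the graph pattern described above.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Observer(Protocol):
    """Structural type: anything with a receiveUpdate(subject) member."""

    def receiveUpdate(self, subject: Any) -> None: ...


class Logger:
    """Never mentions Observer, but matches it structurally."""

    def __init__(self):
        self.seen = []

    def receiveUpdate(self, subject):
        self.seen.append(subject)


class Unrelated:
    """Has no receiveUpdate member, so it does not match."""


matches = isinstance(Logger(), Observer)
does_not_match = isinstance(Unrelated(), Observer)
```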


french keyboard notes

\ is shifted

double shifted: ~#{}[]

{}[] double shifted in german too

{}[] is worse than <>

most easy: !:;,*, and then &l"()-_=

shifted: . numbers <> ?


does the whole loose coupling between interface types and implementation types require whole program analysis (i.e. as opposed to shared libraries)?


need syntax to define an interface pattern via a prototype. for example, in scala:

  package ui2
  trait Clickable {
    def click()
  }

in jasper terms, if you have a node

mod m f = x : x+3 g = 4

then you need to be able to construct a type that is something like "the set of all things with a child f, t \in Num -> f : t -> t, and a child g, g : Int".

probs: why do you say g : Int? g is also the type 4, for example. also, is the explicit parametric polymorphism appropriate, or do we just want to say that we take Num and return Num? the difference is that, what if someone sends us something that is Printable and also Num, and we just send back a Num? the parametric polymorphism declares that if we get something that is Printable and Num, we'll send back something that is Printable and Num. The other way (just returning something that is guaranteed to be a Num) doesn't. But, if types are just sets, and, "4", say, is a type, then since we don't have "principal types", it seems like parametric poly doesn't make sense; because the only thing that has ALL the predicates of a given value is that value itself. but so how do we express that we intend to keep Printable? This really seems to be a function of what operations we perform on the input in order to get the output.


put parens around multiline statements to make blocks? parens the same as blocks?

should ((x = 1; (x pr)); (x = 2; x pr)) seq print 1 2 ?

---

how to mix "separation of concerns" as in AOP with inheritance?

---

restricted value-dependent types; graph matching; after terminator, (constant) value lookup table? but need to combine info... so actually, one lookup table; lhs is graphs to be matched, rhs is types, no terminator needed; on rhs, can include types not just values; but what if want to match a type value? need some form of escaping. this is a general problem for graph matching, actually

---

guessing on what i meant by "actually, any attached opening paired delimiter can mean to omit the other member? []... bad? {}... good?"

e.g. instead of

(x :: Int),

(x }{ Int)

mb better if

(x {{ Int)

mb that's bad for foreign keyboards, tho


arbitrary ;;;;s for sep diff dimensions of hyperrectangles (n-ary arrays).

2d arrays of code replaces blocks; asserts

naw, let's make 2nd (vertical?) direction of code mean parallel composition and make asserts manual

but why not assign semantics to more dimensions? 3rd dimension could be interpretive meta, for example..

hmm... seems like vertical should not mean parallel composition... ';' should just mean "composition", which can be either seq or par (different modes of composition), and in which all/at least one/exactly one statement is guaranteed to be executed (another dimension of mode).

following the matrix idea, tho, vert could mean "vector output", i.e. when you want to collect not just one result, but multiple results -- in contrast to horizontal composition, which, in analogy to addition in matrix multiplication, means a bunch of statements which get combined together somehow to produce only one output.

why not allow () to group within [] rather than switching to "code mode"?

if do keywords the old way, then: need fn to convert keywords to local symbols of the same name, so that you can do sort([KEYWORD]), and sort can find the corresponding local symbol in the package that defines the thing being sorted. that fn should also accept already local symbols, and do nothing with them

--- warning might be a branch:

Activeness = ['active 'inactive]   // could also have said $'[active inactive]
// or use keywords
Activeness = [ACTIVE INACTIVE]   // could also have said $'[active inactive]
Activeness2 = [CANCELLED NOT_STARTED WAITING_FOR STARTED DONE]

many'-1' project.actions action.project many'-1' category.projects project.category

action.name :: str
action.status :: Activeness2
action.priority :: [-1 0 1 2]
action.note :: str
action.duration :: duration
action.inactiveUntil :: date

action.dueDate :: date'
action.alarm :: date'
action.context :: context []
action.contacts :: contact []
action.repeatDuration :: duration'
project.name :: str
category.name :: str
project.status :: Activeness2
project.nextAction :: action'
project.inactiveUntil :: date
project.dueDate :: date'
project.nextReviewDate :: date'
project.note :: str
SOMEDAY before this, action doesn't need to be thought about
SOMEDAY

category.status :: activeness context.name :: str [] context.location :: geolocation

todo = allActions todo.sort([DUEDATE STATUS])! general sort, by fn, is sortWith. this is sort by keys; its given a list b/c there can be secondary sorting criteria
is incorrect syntax, that's OR
(action.status in [NOT_STARTED STARTED]) & (action.'project -> action.project.status == ACTIVE) & (action.'category -> action.category.status == ACTIVE) & (action.inactiveUntilDate <= now)

durationSum = todo (.duration) map sum
durationPriority2Sum = (.priority == 2) filter todo (.duration) map sum
durationPriority1Sum = (.priority == 1) filter todo (.duration) map sum
durationPriority0Sum = (.priority == 0) filter todo (.duration) map sum
durationPriority012Sum = (.priority >= 0) filter todo (.duration) map sum

todo (.tillDueDate = .dueDate - now) map !

waitingForList = allActions

(action.status == WAITING_FOR)

somedayProjects = projects (.inactiveUntil == SOMEDAY) map somedayActions = projects (.inactiveUntil == SOMEDAY) map

nextActions = (.nextAction) allProjects map uniq nonil

in action.status, what is "status"? a keyword (global symbol)? a (local) symbol, taken from the same namespace that "action" was defined in?

/ note: (.x == 5) => (?.x == 5)

no, stick with .x as it looks

?.x should maybe be layered, ie (?.x == 5) => (v1 : (v1.x == 5)) immediately, not to the whole expression as stated before. mb ??.x penetrates two layers of parens. if ?.x is used twice in scope, is it different vars or the same or a syntax error?
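a python sketch of the layered reading, where `(.x == 5)` turns into a one-argument function at the innermost paren. the sample `todo` data is invented, and the variable name `v1` just follows the note above.

```python
from types import SimpleNamespace

# (.priority == 2) read as the layered form (v1 : (v1.priority == 2))
is_priority_2 = lambda v1: v1.priority == 2

# (.duration) read as (v1 : v1.duration)
duration = lambda v1: v1.duration

# Made-up sample data standing in for the todo list.
todo = [
    SimpleNamespace(priority=2, duration=30),
    SimpleNamespace(priority=0, duration=15),
    SimpleNamespace(priority=2, duration=5),
]

# durationSum = todo (.duration) map sum
duration_sum = sum(map(duration, todo))

# durationPriority2Sum: filter by (.priority == 2), then map (.duration), then sum
duration_priority2_sum = sum(map(duration, filter(is_priority_2, todo)))
```

each `.x` gets its own lambda here, which corresponds to the "different vars" answer to the question of what happens when `?.x` is used twice in one scope.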


lua field access delegation

read/watch on youtube ~/papers/lua-100310-slides.pdf http://www.stanford.edu/class/ee380/


want to be able to make cylinders, toruses, etc out of n-d arrays

concept of internal vs external reference (only ext is dangerous). like enclosing an expression in a layer of lambda, you can enclose a data structure (or "part" of one, which is a data struct in itself) in a "reference boundary" that defines refs inside that boundary that point to other things inside of it as internal.

e.g. a circular linked list can be implemented using references. but, since no refs go outside, these are internal, from the perspective of the boundary of the list. from the perspective of a single element of the list, however, they are external.

in this context a "volume" is analogous to a set of data elements. a "surface" might be analogous to a basis set, that is, a minimal set of elements such that if you follow all of the links of those elements, you get the whole volume. so, for example, the "inside" of the circular linked list may be defined by enumerating its volume, or alternately by giving any single element in the list as the surface. hmm, no, i think another characteristic of a surface is that it be referenced from outside the data object.

so if i say,

  c_0 = 10 circular_linked_list   // circular_linked_list takes a length and returns element 0 of the new list
  c_2 = c_0.nxt.nxt
  a.0 = c_0
  a.1 = c_2

then from the perspective of a, the surface of the circular list has two elements, c_0 and c_2.
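a python sketch of this example, under the tentative definition that the volume is everything reachable through `nxt` and the surface is the part of the volume that something outside refers to. `Node`, `volume` and `surface` are hypothetical names, and the list length is assumed to be at least 1.

```python
class Node:
    """One element of a circular linked list."""

    def __init__(self, value):
        self.value = value
        self.nxt = None


def circular_linked_list(length):
    """Build a circular list of `length` nodes and return element 0."""
    nodes = [Node(i) for i in range(length)]
    for i, node in enumerate(nodes):
        node.nxt = nodes[(i + 1) % length]
    return nodes[0]


def volume(start):
    """All nodes reachable from `start` by following nxt links."""
    seen, order = set(), []
    node = start
    while node not in seen:
        seen.add(node)
        order.append(node)
        node = node.nxt
    return order


def surface(vol, outside_refs):
    """The nodes of the volume that are referenced from outside it."""
    outside = set(outside_refs)
    return [n for n in vol if n in outside]


c_0 = circular_linked_list(10)
c_2 = c_0.nxt.nxt
a = [c_0, c_2]  # a.0 = c_0, a.1 = c_2

list_surface = surface(volume(c_0), a)  # the two elements c_0 and c_2
```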

i'm not sure this "surface" concept is useful, at least for local data manipulation. in the real world, things not on the surface are universally inaccessible. but in data, any data object can hold a reference to anything "inside" another data object. so it's all volume, no surface. a real surface would be something like an SQL cursor, or a network connection to a remote object -- i hold a proxy object that makes me think i have direct access to the data in the database, but i really don't. due to the possibility of proxy objects, i can LOCALLY structure my program as if there were no surface, but globally i will have to take into account the presence of the surface.

so if it's a volume, all we can say is: references from c_0 to c_1 aren't dangerous if both c_0 and c_1 are in the same volume.

so "surface" is really a topological relationship, combined with object perspective, meaning "if you want to go from a point in the volume of one object to a point in the volume of a 'different' object, you will have to go through some point in the surface between those two objects"

so, do we want each data element to natively belong to just one "innermost" volume, or can we have multiple viewpoints about which points are grouped together into volumes?

my current inclination is to have data elements natively belong to just one volume, but to have this association belong to variables, not to the data elements themselves. so, each variable carries not only a value and a type, but also a topology which defines:
* the innermost volume associated with the value held by this variable
* all innermost subvolumes of that volume

so, for example, if we have a circular list, and c_0 holds an element of that list, then c_0's innermost volume could be, among other things:
* c_0 itself, and c_0.0
* all of the elements of this circular list, and their values
* a selection of the elements of this circular list, and their values
* all of the elements of this circular list, and their values, and all of the elements of another circular list which may or may not be referenced by some of the values of this circular list

etc

the point of all this is to make it convenient to think about properties of the program. things in the "same" volume, from the perspective of the current expression, may be treated as referentially transparent. are there some things which are "really" not referentially transparent, and hence may not be placed into the same volume? yes, but the programming language is not allowed to enforce this. for example, in order to have true proxy objects, you need to allow the programmer to say, "here is a data structure imitating a list whose getters and setters, in reality, send and receive across a network connection -- but i want you to pretend it's just like a normal list", which implies that the language should treat it as ref trans (and can cache gets from it, etc).

when you put two things into the same volume, it implies a guarantee that they are ref trans (except for exceptions -- i.e. that list proxy object is allowed to throw an exception if it can't contact the host).

of course abstractions are leaky, and ways should be provided to access the underlying reality, also (i.e. to consider a more separate topology).

note that in some sense the terminators (barriers, mb i should call them) are dual to volumes -- indeed, barriers are a way to define a surface. you can either explicitly specify the volumes and ask what are the surfaces, or you can explicitly specify the surfaces and ask what are the volumes. fns should be provided to do this.

if (as i'm thinking), the language syntax requires a special marker for references/non ref trans things, then that marker is only required at surfaces, not within volumes -- according to the currently scoped topology (i.e. the topology associated with the variable in question, i guess)

e.g. if the ref marker is prefix _, then

// how to declare param poly? assume this is a circular linked list with element value type int'
circular_linked_list_3 =
  // i forgot how to use prototypes in data declarations here..
  lnode.next :: _lnode
  lnode.0 :: int'

  //note: lnode.0 is automatically made part of the same volume as lnode, b/c it's not a reference
  // how to declare multiple symbols to be of the same type compactly? mb ::- (:: map) ?
  ./
  c_0 :: lnode
  c_1 :: lnode
  c_2 :: lnode
  /.
  [c_0 c_1 c_2] ::- lnode
  ./
  c_0._next _= c_1   // underscores needed b/c these arent in the same volume yet
  c_1._next _= c_2     // todo: are we allowed to do ref ops at all on things in the same vol? i think not. how to store the "lower level of abstraction" for possible use later?
  c_2._next _= c_0
  /.
  // using underscores twice is confusing. should either be C semantics (a reference is a type of value) or Python (assigning to a reference type is implicitly making a link). i guess Python is simpler.
     // no, changed my mind. the underscored _next reminds you that this is a reference field. the _= makes the "making a link" part explicit. for example, you could do
       // c_0._next = c_1._next
     // which is a normal assignment, not a ref creation
         // (without C semantics, tho, there's no way to make a handle..)
         //  (sure there is, just use 2 underscores)
         // (but when deref'ing, it derefs all the way..
         //  (type system chooses how much to deref)
         // (mb ambiguity in certain cases)
         //  (so be it)
         // we could explicitly deref by removing _s
         //  but then we lose the use of _ as a reminder that this is a deref'd ref, not a normal value
       // also, mb should be c_0._next = c_1 _
         // i know, make the __s go backwards when derefing, i.e. c_0.next refers to the reference itself, and c_0._next refers to the deref'd value. in C, if C had an attached/unattached distinction, then instead of "addr = & val; val = * addr;" we would do "addr = & val; val = &addr" -- it's easier to see in jasper b/c _ prefix is deref, and on its own, i.e. to the right, is ref. i.e. addr = val _, and val = _addr.
  c_0._next = c_1 _  // underscores needed b/c these arent in the same volume yet
  c_1._next = c_2 _    // todo: are we allowed to do ref ops at all on things in the same vol? i think not. how to store the "lower level of abstraction" for possible use later?
  c_2._next = c_0 _
  // how to specify vols?
  [c_0 c_1 c_2] vol
     // specifies this volume as the "innermost" one for all the vars in the list
     // this volume includes the subvolumes, not just the points c_0 c_1 c_2 themselves.. so c_0.0, c_1.0, c_2.0 are included
  c_0     

c_0 = circular_linked_list_3
c_0 = 0
c_0.next = 1   // would have had to be c_0._next, except that c_0 and c_1 are in the same volume
c_0.next.next = 2
c_0.next.next.next == c_0   // i guess the language should be smart enough not to recurse infinitely here...

note: in order to make it easy to guess how much to deref, mb a constraint on volumes is that they are not allowed to contain points that are themselves the references to other points in the same volume. so, if you have "ptr = val _; handle = ptr _;", you can't do "[val ptr handle] vol", because then would "handle" refer to ptr or to val? i guess this should be a type constraint b/c we want it to be compile time, not runtime. so, the constraint is that you can't have things that can even potentially hold a ref to something else in the volume. so "val :: valT; ptrT = _valT;

note: is ptrT = _valT the right way to do it?

mb when sets are acting as types, they should be capitalized after all? so "PtrT = _ValT"? this also solves the prob with types as types vs. types as vals in the graph pattern matching. or, could allow (or even require) them to NOT be capped right after ::... hmmm.. mb this even provides a good notation for type prototypes!

as i was saying, so "val :: ValT; ptr :: _ValT; [val ptr] vol" is illegal, because if you did that, what if you had "ValT.next :: _ValT;", and then, later in the code, you said "val.next". since val._next might be holding a "ptr", or might be holding "val _" (or mb "ptr = val _: val.next = ptr") -- so does "val.next" mean the reference (since "ptr" is in the volume) or does it mean the dereferenced value? you want to make this choice at compile time, so that you can compile out the vols and add explicit dereferencing. for this reason, volumes can't contain things that could point to other things in the volume.

btw, language should detect when, e.g. indices arent being used, and use a set instead of a list -- but then mb set ops must be primitive? so be it

i guess the idea is that the "vol" directive specifies a subset of the data where python-like conventions hold, for now -- implicit references (although it's not as simple as that b/c WHERE the refs are implicit is defined by code, not by the language).

btw, if you wanted to define volumes by giving the surfaces, another thing you could do is give the internal links (i.e. explicitly define non-surfaces and let the external surface be defined by exclusion). i.e. to say, "this link is an implicit reference -- this one is too -- any others are explicit".

so, for example, you could say,

circular_linked_list_3 =
  Lnode.next :: $_Lnode   // reverses the _ convention
  Lnode.0 :: int'

  [c_0 c_1 c_2] ::- lnode
  c_0._next = c_1 _ 
  c_1._next = c_2 _ 
  c_2._next = c_0 _
  c_0     

c_0 = circular_linked_list_3
c_0 = 0
c_0.next = 1
c_0.next.next = 2
c_0.next.next.next == c_0

i dont think this is the way to go. it makes the syntactic transformation more understandable, but it defeats the primary purpose of all this, which is to have the compiler help keep track of which references are so safe that they can be implicit -- you could easily use one of these implicit links to link to something that you think is safe but isnt. but when you explicitly state the targets which are safe, rather than the links, this puts things more explicitly (because it's not the link which determines danger, it's the target).

note: although the volumes in use go with the variables, the points in the vol are actual data points.

btw let me give the above example w/o the comment cruft:

circular_linked_list_3 =
  lnode.next :: _Lnode
  lnode.0 :: Int'

  [c_0 c_1 c_2] ::- lnode
  c_0._next = c_1 _
  c_1._next = c_2 _
  c_2._next = c_0 _
  [c_0 c_1 c_2] vol
  c_0     

c_0 = circular_linked_list_3
c_0 = 0
c_0.next = 1
c_0.next.next = 2
c_0.next.next.next == c_0

note: all this stuff about volumes should be made to fit with the "owner" ideas of http://bartoszmilewski.wordpress.com/2009/05/26/race-free-multithreading/

(p.s. remember to read the rest of that blog too, it's good)

also we need a way to specify "unique" objects, i.e. ones that only have one reference pointing to them

:= for mv like in that guy's blog? or mb uniq objs should just be moved automatically, so that library writers dont have to think about it?

i guess references are always nullable?

remember -- all fields are implicitly handled by getters and setters -- there is no distinction


see Self:notes-computer-programming-typecheckedRaceFreeMultithreadingViaOwners for notes on http://bartoszmilewski.wordpress.com/2009/05/26/race-free-multithreading/


"objects" (obj), different from OOP objects, a concept of grouping of data points, marked by barriers, that uses three ideas:

e.g. the circular linked list above could be one 'object'. this means:

todo: how does this fit in with composition, delegation, late binding of self, traits, instance-local fields, single inheritance, implementations, views, constructors?

some rough ideas:


nullable instances (nil is a type that applies to every node with the special __isNil attribute set to t)

this way, partially constructed objects can be nil


note: when completing the construction of an object, you need a write fence. when querying it, you need a read fence.

http://bartoszmilewski.wordpress.com/2008/08/04/multicores-and-publication-safety/

note: if you use monitors/locks or java volatile or atomics, the lower-level fence constructs are built-in


note: if something can be passed something unique, then it has to have the property that when it copies something, the old copy can be replaced with nil. i wanted to automatically replace assignment with move for unique things, but i guess that's fine, but that the ability for a function to be uniquified is an attribute that some functions will have and others won't (and some will compile as if it will work, but it doesn't, and some will not compile even tho it would). so mb __uniquable should be a property of function input slots??

otoh mb __uniquable should be the default -- when you assign refs, you copy the value it points to unless you use the special reference operators, so only when using those can you copy a reference. unless, of course, the value you are blindly manipulating is itself a reference, in which case you can copy it without realizing you are aliasing a reference. so when you allow a reference to be manipulated as an ordinary value, uniqueness should not be a default anymore, because library functions will want to copy values sometimes.

there's really no good solution, i guess -- uncopyable values really go against the core semantics of computation, so you can't expect to use them interoperably with naive code without pain.
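a tiny python sketch of "assignment becomes move" for unique things: the old holder is replaced with nil once the value is handed on. the `Unique` class and its methods are hypothetical, and `None` stands in for nil (so a unique `None` value can't be represented in this sketch).

```python
class Unique:
    """Hypothetical holder for a value that may have only one owner."""

    def __init__(self, value):
        self._value = value

    def move(self):
        """Hand the value to a new Unique and leave nil (None) behind."""
        if self._value is None:
            raise ValueError("value was already moved")
        moved, self._value = Unique(self._value), None
        return moved

    def get(self):
        """Return the held value, or None if it has been moved away."""
        return self._value


a = Unique([1, 2, 3])
b = a.move()  # a now holds nil, b is the only owner
```

naive code that does `b = a` instead of `b = a.move()` silently aliases the holder, which is the interoperability pain described above: nothing in plain python stops the copy.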

--- layers of referential transparency:

say you have a container that acts like a referentially transparent array, but internally is implemented with a cache. on one level of abstraction, it is referentially transparent; but on another, it is not. another example would be using an FFI (foreign function interface) from a language that supports referential transparency to one that does not. another example would be RPC/RMI (remote procedure call/remote method invocation; calling a subroutine that is actually, although transparently, executed over the network).

the language should support explicit demarcation of such "layers", so that the programmer can offer things like proxy objects that claim to be referentially transparent (except for exceptions).
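a python sketch of the first example: a container that reads like a plain immutable list but is backed by a cache and a fetch function. `RemoteListProxy` and the `fetch` parameter are made up, and `fetch` could be a network call that raises when the host is unreachable.

```python
class RemoteListProxy:
    """Reads like a list; each cache miss calls a (hypothetical) fetch function."""

    def __init__(self, fetch, length):
        self._fetch = fetch
        self._length = length
        self._cache = {}
        self.fetch_count = 0  # visible only at the lower layer of abstraction

    def __len__(self):
        return self._length

    def __getitem__(self, i):
        if not 0 <= i < self._length:
            raise IndexError(i)
        if i not in self._cache:
            self.fetch_count += 1
            self._cache[i] = self._fetch(i)  # may raise, e.g. host unreachable
        return self._cache[i]


backing = [10, 20, 30]
proxy = RemoteListProxy(backing.__getitem__, len(backing))
first = proxy[1]
second = proxy[1]  # same answer, served from the cache
```

at the upper layer `proxy[1]` always gives the same answer, so it looks referentially transparent. at the lower layer `fetch_count` changes on the first read, so it is not. the claim of transparency also relies on the backing data not changing, which the language would have to take on trust.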


conceptual primitives needed to talk about [1]:

im trying to think of what sort of metalanguage primitives would allow you to express something like milewski's system without hardcoding the whole thing into the core language (not b/c i dont like it enuf, just b/c i want to keep the core language small).


http://bartoszmilewski.wordpress.com/2009/08/19/the-anatomy-of-reference-counting/

" What is there to reference counting that is not obvious? In any language that supports deterministic destruction and the overloading of the copy constructor and the assignment operator it should be trivial. Or so I though until I decided to implement a simple ref-counted thread handle in D. Two problems popped up:

   1. How does reference counting interact with garbage collection?
   2. How to avoid data races in a multithreaded environment?

In purely garbage-collected languages, like Java, you don’t implement reference counting, period. Which is pretty bad, if you ask me. GC is great at managing memory, but not so good at managing other resources. When a program runs out of memory, it forces a collection and reclaims unused memory. When it runs out of, say, system thread handles, it doesn’t reclaim unused handles–it just dies. You can’t use GC to manage system handles. So, as far as system resources go, Java forces the programmer to use the moral equivalent of C’s malloc and free. The programmer must free the resources explicitly.

In C++ you have std::shared_ptr for all your reference-counting needs, but you don’t have garbage collection for memory–at least not yet. (There is also the Microsoft’s C++/CLI which mixes the two systems.)

D offers the best of both worlds: GC and deterministic destruction. So let’s use GC to manage memory and reference counting (or other policies, like uniqueness) to manage other limited resources. "


named multiple return values (like Octave)

mb some sugar/convention for non-error return statuses? like "status" or "ret(urn)" or "res(ult)" or r or s?
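for reference, octave declares named outputs in the signature, as in `function [q, r] = f(a, b)`. a rough python approximation uses a named tuple, with a `status` field for the non-error return status idea. `DivResult` and `divide` are invented names.

```python
from typing import NamedTuple


class DivResult(NamedTuple):
    """Named multiple return values, plus a non-error status."""

    quotient: int
    remainder: int
    status: str


def divide(a, b):
    """Integer division that reports problems via status instead of raising."""
    if b == 0:
        return DivResult(0, 0, "division by zero")
    return DivResult(a // b, a % b, "ok")


result = divide(7, 2)
q, r, status = result  # positional unpacking still works
by_name = result.remainder  # and so does access by name
```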


remember, important to have interfaces declarable after-the-fact. comment on Go:

"Anything which implements those methods implements the interface. Even if the interface was defined later than a type, in a different module, compiled separately, if the object implements the methods named in the interface, then it implements the interface." -- http://scienceblogs.com/goodmath/2009/11/googles_new_language_go.php

of course, u can do this in other languages too, haskell type classes, perl6 roles (i think), etc


resumable exceptions seem to be called "condition handling" with "restart", not "resume"


http://bartoszmilewski.wordpress.com/2009/09/01/spawning-a-thread-the-d-way/

" How do you build unit tests whose compilation should fail? Well, D has a trick for that (ignore the ugly syntax):

  void fo(Object o) {}
  assert (!__traits(compiles, (Object o) { return spawn(&fo, o); }));

This code asserts that the function literal (a lambda),

(Object o){ return spawn(&fo, o); }

does not compile with the thread function fo. Now that’s one useful construct worth remembering! "


http://bartoszmilewski.wordpress.com/2009/09/01/spawning-a-thread-the-d-way/

milewski's spawn: "
* spawn should take an arbitrary function as the main argument. It should refuse (at compile time) delegates or closures, which would introduce back-door sharing. (This might be relaxed later as we gain experience in controlling the sharing.)
* It should take a variable number of arguments of the types compatible with those of the function parameters. It should detect type mismatches at compile time.
* It should refuse the types of arguments that are prone to introducing data races. For now, I’ll allow only value types, immutable types, and explicitly shared types (shared is a type modifier in D).

"
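a python sketch of the same three rules, with the big caveat that python can only check them at run time, not at compile time as D does. `SAFE_TYPES`, `SHARED_TYPES`, `is_local_msg_type` and this `spawn` are all assumptions for illustration; `queue.Queue` stands in for an explicitly shared, self-synchronizing type.

```python
import queue
import threading

# Assumption: these stand in for "value types and immutable types".
SAFE_TYPES = (int, float, bool, str, bytes, type(None))
# Assumption: types that provide their own synchronization ("shared").
SHARED_TYPES = (queue.Queue,)


def is_local_msg_type(arg):
    """Run-time stand-in for the compile-time predicate isLocalMsgTypes."""
    if isinstance(arg, tuple):
        return all(is_local_msg_type(a) for a in arg)
    return isinstance(arg, SAFE_TYPES + SHARED_TYPES)


def spawn(fn, *args):
    """Start fn(*args) in a new thread, refusing closures and unsafe arguments."""
    if getattr(fn, "__closure__", None):
        raise TypeError("closures are refused: they allow back-door sharing")
    for a in args:
        if not is_local_msg_type(a):
            raise TypeError(f"unsafe argument type: {type(a).__name__}")
    thread = threading.Thread(target=fn, args=args)
    thread.start()
    return thread


def worker(i, s, out):
    """Example thread function: reports its result through the shared queue."""
    out.put(s * i)


results = queue.Queue()
spawn(worker, 2, "hello", results).join()
```

passing a plain list, e.g. `spawn(worker, [2], "hello", results)`, raises TypeError here. a function can still reach mutable module globals, so this sketch is much leakier than the D version.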


http://bartoszmilewski.wordpress.com/2009/09/01/spawning-a-thread-the-d-way/

milewski's spawn implementation:

" Without further ado, I present you with the implementation of spawn that passes all the above tests (and more):

  Tid spawn(T...)(void function(T) fp, T args)
      if (isLocalMsgTypes!(T))
  {
      return core.thread.spawn( (){ fp(args); });
  }

This attractively terse code uses quite a handful of D features, so let me first read it out loud for kicks:

          (){ fp(args); }
      which captures local variables, args. 

As you may guess, the newly spawned thread runs the closure, so it has access to captured args from the original thread. In general, that’s a recipe for a data race. What saves the day is the predicate isLocalMsgTypes, which defines what types are safe to pass as inter-thread messages.

Note the important point: there should be no difference between the constraints imposed on the types of parameters passed to spawn and the types of messages that can be sent to a thread. You can think of spawn parameters as initial messages sent to a nascent thread. As I said before, message types include value types, immutable types and shared types (no support for unique types yet).

Useful D features

Let me explain some of D novelties I used in the definition of spawn.

A function with two sets of parenthesized parameters is automatically a template–the first set are template parameters, the second, runtime parameters.

Tuples

Type tuples, like T…, represent arbitrary lists of types. Similar constructs have also been introduced in C++0x, presumably under pressure from Boost, to replace the unmanageably complex type lists.

What are the things that you can do with a type-tuple in D? You can retrieve its length (T.length), access its elements by index, or slice it; all at compile time. You can also define a variable-argument-list function, like spawn and use one symbol for a whole list of arguments, as in T args:

Tid spawn(T...)(void function(T) fp, T args)

Now let’s go back to my test:

Tid tid = spawn(&f, 2, s, "hello");

I spawn a thread to execute a function of three arguments, void f(int i, S s, string str). The spawn template is instantiated with a type tuple (int, S, string). At compile time, this tuple is successfully tested by the predicate isLocalMsgTypes. The actual arguments to spawn, besides the pointer to function, are (2, s, “hello”), which indeed are of correct types. They appear inside spawn under the collective name, args. They are then used as a collective argument to fp inside the closure, (){ fp(args); }.

Closures

The closure captures the arguments to spawn. It is then passed to the internal function (not a template anymore),

core.thread.spawn(void delegate() dlg)

When the new thread is created, it calls the closure dlg, which calls fp with the captured arguments. At that point, the value arguments, i and s are copied, along with the shallow part of the string, str. The deep part of the string, the buffer, is not copied–and for a good reason too– it is immutable, so it can safely be read concurrently. At that point, the thread function is free to use those arguments without worrying about races.

Restricted Templates

The if statement before the body of a template is D’s response to C++0x DOA concepts (yes, after years of design discussions, concepts were finally killed with extreme prejudice).

if (isLocalMsgTypes!(T))

The if is used to create “restricted templates”. It contains a logical compile-time expression that is checked before the template is instantiated. If the expression is false, the template doesn’t match and you get a compile error. Notice that template restrictions not only produce better error messages, but can also impose restrictions that are otherwise impossible or very hard to enforce. Without the restriction, spawn could be called with an unsuitable type, e.g. an Object not declared as shared and the compiler wouldn’t even blink.

(I will talk about template restrictions and templates in general in a future blog.)

Message Types

Besides values, we may also pass to spawn objects that are declared as immutable or shared (in fact, we may pass them inside values as well). In D, shared objects are supposed to provide their own synchronization–their methods must either be synchronized or lock free. An example of a shared object that you’d want to pass to spawn is a message queue–to be shared between the parent thread and the spawned thread.

You might remember that my race-free type system proposal included unique types, which would be great for message passing, and consequently as arguments to spawn (there is a uniqueness proposal for Scala, and there’s the Kilim message-passing system for Java based on unique types). Unfortunately, unique types won’t be available in D2. Instead some kind of specialized Unique library classes might be defined for that purpose.

"


C++0x initialization:

http://en.wikipedia.org/wiki/C%2B%2B0x

"
  struct BasicStruct {
      int x;
      double y;
  };

  struct AltStruct {
      AltStruct(int x, double y) : x_{x}, y_{y} {}

  private:
      int x_;
      double y_;
  };

  BasicStruct var1{5, 3.2};
  AltStruct var2{2, 4.3};
"

---

raw string literal with arbitrary delimiter. e.g. in C++0x, R"delimiter[The String Data \ Stuff " ]delimiter"


y'know, adding an interface implementation is kinda like adding a trait... or rather, like telling the compiler it can auto-add that trait when that interface is called for on an object with the prerequisite interfaces.


uniform syntax for implicitly passing "self", like in python? more general than classes?


uniform syntax for passing normal arguments and "type arguments", i.e. parametric polymorphism?

syntax would include way to add initial args to a bunch of things at once similar to the superclass argument adder discussed previously


although multiple named outputs are easily marked on the last line of the function, mb they should be marked on the first line, instead, so as to make the "output signature" next to the "input signature" for ez reading. in fact, mb just have the return spec be the first line rather than the last. in this case, the indented block is taken as sort of a "let" (unless we're in imperative/monadic mode).


=> is, in general, for saying things like:

  premise1, premise2 => conclusion1

in other words, this puts a multiarc whose srcs are nodes premise1 and premise2 and whose dest is node conclusion1, with the understanding that this will be used in some procedure that when some activation value is present in both of the sources, the multiarc will activate and put some activation value into the dest.

so, should mb upgrade the graph type to hypergraph

note that this is just like petri nets, where the activation condition is one or more tokens, the output activation value is one token, and instead of talking of a single hyperarc it talks of a single transition with multiple input and output arcs.
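a rough python sketch of the multiarc idea above, assuming the simplest possible activation rule (a node is either active or not, and a multiarc fires once all of its sources are active). the names Multiarc and fire are made up for illustration and are not jasper syntax:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Multiarc:
    """A directed hyperarc: several source nodes, one destination node."""
    srcs: frozenset          # premise nodes
    dst: str                 # conclusion node
    label: str = "infers"    # hypothetical default label


def fire(arcs, active):
    """Propagate activation until no multiarc can activate a new node."""
    active = set(active)
    changed = True
    while changed:
        changed = False
        for arc in arcs:
            # the arc fires only when every source is active
            if arc.srcs <= active and arc.dst not in active:
                active.add(arc.dst)
                changed = True
    return active


arcs = [
    Multiarc(frozenset({"premise1", "premise2"}), "conclusion1"),
    Multiarc(frozenset({"conclusion1"}), "conclusion2"),
]
```

with both premises active, fire(arcs, {"premise1", "premise2"}) also activates conclusion1 and then conclusion2; with only premise1 active nothing fires.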

proof nets, combinatorial proofs?

syntax for hypergraph arc follow:

in general [p1, p2]."infers" means the set of destination nodes for the multiarc whose sources are p1 and p2 and whose label is "infers".

m.b. => has an implicit label (=>_infers is an example of making this explicit)? also, note that => is arc creation. so, in a context where => is short for =>_infers, we have

  ([p1, p2] => c1)   ==   [p1, p2]."infers" = c1
                     ==   "infers"=c1 [p1, p2] ins

but what sort of object is [p1, p2]? it's not temporary -- is it dynamically created?

x.a is short for [x].a, i guess but then how to disambiguate x.a when x contains a list? one way is to introduce "singular" vs. "plural" context, and (unlike Perl, where context is determined by the type expected upon output) determine context syntactically based on whether the input is a list or not. recall that the @ operator takes the list out of a variable and makes it as if the list was typed right there. so:

    x.a   --> singular context, short for [x].a
    @x.a  --> plural context  (not short for anything)
    [x y z].a --> plural context  (not short for anything)
    ([x y z]).a --> singular context, short for [([x y z])].a
    functions that distinguish b/t singular vs. plural context always work just like this  -- singular things are implicitly turned into single-element lists before feeding them into the function

should we have a way to do the opposite, functions which have some arguments with a "singular" preference, so that they automatically map if the argument given is syntactically plural? naw, this sounds too hard to keep track of, and we are already going to have a lightweight mapping symbol

mb when defining a fn, use an annotation, ^s, to mark a plural context:

  lookup = arcsrc^s arclabel : arcsrc.arclabel

--- y'know, mb we should put the mapping symbol on a specific ARGUMENT, not on the controlling function. putting it on multiple args could mean pointwise, or it could mean first map one and then map the other, e.g. if the symbol were -,

[2 3]- 5 * == [10 15]

[2 3]- [5 7]- * == [10 21]
OR
[2 3]- [5 7]- * == [[10 14] [15 21]]

or mb let # of symbols determine; e.g.

[2 3]- [5 7]- * == [10 21]
[2 3]-- [5 7]- * == [[10 14] [15 21]]
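the readings of a per-argument mapping mark can be compared in a small python sketch. the helper names (map_one, map_pointwise, map_nested) are invented here, and the nested version assumes "first map one and then map the other" means one output row per element of the first list:

```python
from operator import mul


def map_one(xs, y, op):
    """One marked argument: map op over xs with y held fixed."""
    return [op(x, y) for x in xs]


def map_pointwise(xs, ys, op):
    """Two marked arguments, pointwise reading: pair elements up."""
    if len(xs) != len(ys):
        raise ValueError("pointwise mapping needs lists of equal length")
    return [op(x, y) for x, y in zip(xs, ys)]


def map_nested(xs, ys, op):
    """Two marked arguments, nested reading: map over xs, then over ys."""
    return [[op(x, y) for y in ys] for x in xs]
```

here map_one([2, 3], 5, mul) gives [10, 15], map_pointwise([2, 3], [5, 7], mul) gives [10, 21], and map_nested([2, 3], [5, 7], mul) gives one row per element of the first list.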

---

library fn to flatten the top n levels of lists; ie. [[1 2 3] [4 5 6]] 1 flatn == [1 2 3 4 5 6] (what happens if it gets [[1 2 3] 4 5 6]? i guess it throws an error -- unless you give it a flag that says that's okay: [[1 2 3] 4 5 6] 1 uniform=f flatn)
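a minimal python sketch of flatn, assuming that "uniform" means every element at the level being flattened must itself be a list; the argument order and the choice of TypeError are guesses, not a spec:

```python
def flatn(xs, n=1, uniform=True):
    """Flatten the top n levels of a nested list.

    With uniform=True a non-list element at a level being flattened is an
    error; with uniform=False it is passed through unchanged.
    """
    for _ in range(n):
        out = []
        for x in xs:
            if isinstance(x, list):
                out.extend(x)
            elif uniform:
                raise TypeError(f"cannot flatten non-list element {x!r}")
            else:
                out.append(x)
        xs = out
    return xs
```

so flatn([[1, 2, 3], [4, 5, 6]], 1) returns the flat list, flatn([[1, 2, 3], 4, 5, 6], 1) raises, and the same call with uniform=False is accepted.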

---

xri, xdi

http://en.wikipedia.org/wiki/Extensible_Resource_Identifier

--- remember basic http, webdav, xanadu, rdf, owl, xri ops:

create delete read write move search

domain range subset

isa
has part (meronym) e.g. car has an engine
has attribute e.g. hair has a color

action computation

type (data model) format (representation)

variable

also:

entity/agent

subject object predicate relation verb noun adjective adverb preposition sender receiver message reference

versioning sync access permissions

kant's categories

http://wiki.oasis-open.org/xdi/XdiRdfModel looks pretty lightweight/good


need a way to condition code on if a variable is of a given type (at compile time) (like C++ type traits in templates) umm, isnt this just ad-hoc interface definition?


generalized concept of execution times:

compile time lazy runtime strict runtime lazy async (future/promise?) + multistage


generalized MRO + include the idea in that semantic web book


multiple aliases is the prob; copying a ref is the critical action


UML class diagrams:

http://edn.embarcadero.com/article/31863


no classes per se, but "ISA" relation. unifies instance/class and value/type


goal to be a successor to Lisp, the "only computer language that is beautiful."


UML fork and join (join = 'barrier') are perhaps the important "synchronization" primitives for explicit high-level code parallelism


code vs. data parallelism


visibility (public vs. private): compile time vs. implementation.

private compile-time: only things of that same class can access it
private implementation: different objects cannot access it, even if they are of the same class

public, protected, private, private implementation, module


uml 2.0 std properties:

{readOnly}, {union}, {subsets property-name}, {redefines property-name}, {ordered}, {bag}, {seq} ({sequence}), {composite}

-- http://www.holub.com/goodies/uml/

 (some of?) these std props can also be constraints on relationships (ordered, etc)

in UML, why is there an 'aggregation' relation? is this different from an 'association' with a cardinality that may go above 1? does 'aggregation' just imply that one thing is 'a part of' another?


in UML, the roles and cardinalities next to each entity are the roles and cardinality of that entity, i.e. if it's many of A to 1 of B, "many" is next to A and "1" is next to B.


according to http://www.holub.com/goodies/uml/:

" Aggregation (comprises) relationship.1 Destroying the "whole" does not destroy the parts.

Composition (has) relationship.1 The parts are destroyed along with the "whole." "

and

" (1) Composition vs. Aggregation: Neither "aggregation" nor "composition" really have direct analogs in many languages (Java, for example).

An "aggregate" represents a whole that comprises various parts; so, a Committee is an aggregate of its Members. A Meeting is an aggregate of an Agenda, a Room, and the Attendees. At implementation time, this relationship is not containment. (A meeting does not contain a room.) Similarly, the parts of the aggregate might be doing other things elsewhere in the program, so they might be referenced by several objects. In other words, there's no implementation-level difference between aggregation and a simple "uses" relationship (an "association" line with no diamonds on it at all). In both cases an object has references to other objects. Though there's no implementation difference, it's definitely worth capturing the relationship in the UML, both because it helps you understand the domain model better, and because there are subtle implementation issues. I might allow tighter coupling relationships in an aggregation than I would with a simple "uses," for example.

Composition involves even tighter coupling than aggregation, and definitely involves containment. The basic requirement is that, if a class of objects (call it a "container") is composed of other objects (call them the "elements"), then the elements will come into existence and also be destroyed as a side effect of creating or destroying the container. It would be rare for an element not to be declared as private. An example might be a Customer's name and address. A Customer without a name or address is a worthless thing. By the same token, when the Customer is destroyed, there's no point in keeping the name and address around. (Compare this situation with aggregation, where destroying the Committee should not cause the members to be destroyed---they may be members of other Committees).

In terms of implementation, the elements in a composition relationship are typically created by the constructor or an initializer in a field declaration, but Java doesn't have a destructor, so there's no way to guarantee that the elements are destroyed along with the container. In C++, the element would be an object (not a reference or pointer) that's declared as a field in another object, so creation and destruction of the element would be automatic. Java has no such mechanism. It's nonetheless important to specify a containment relationship in the UML, because this relationship tells the implementation/testing folks that your intent is for the element to become garbage collectable (i.e. there should be no references to it) when the container is destroyed. "
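the aggregation/composition distinction quoted above, as a small python sketch. the class names follow the quote's examples (Committee, Customer); Member, Address and the label method are invented for illustration. like java, python has no destructor guarantee, so composition here is only a convention: the parts are created by the constructor and never handed out.

```python
class Member:
    def __init__(self, name):
        self.name = name


class Committee:
    """Aggregation: holds references to members that exist independently."""

    def __init__(self, members):
        self.members = list(members)


class Address:
    def __init__(self, street):
        self.street = street


class Customer:
    """Composition: creates its own parts and keeps them private."""

    def __init__(self, name, street):
        self._name = name
        self._address = Address(street)  # created with, and owned by, the customer

    def label(self):
        return f"{self._name}, {self._address.street}"


alice = Member("alice")
budget = Committee([alice])
ethics = Committee([alice])   # the same member is shared by two committees
del budget                    # dropping one committee does not affect alice
customer = Customer("bob", "1 main st")
```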


ISA implies 'implies (extension is a subset of)'.

a type-ish thing.

different from 'classes' as in code reuse -- that's more like copying the prototype

instead of ISA, mb just use implies (mb even =>)


if nominal class via constants, then compact syntax for declaring a variable of that class, w/o manually setting the relevant constant (and the relevant constants for all superclasses recursively)? or does the prototype provide this for us?


navigability in UML is which way you can traverse many-one (etc) links. represented by arrows.

"Typically, if a role is specified, then navigability in the direction of that role is implicit. If an object doesn't have a role in some relationship, then there's no way to send messages to it, so non-navigability is implicit." -- http://www.holub.com/goodies/uml/

--- what i call 'disjunctive types' may be what c++ calls a "union"

---

implies in relationships; personA is-chair-of committeeB implies personA is-member-of committeeB.
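one way to read "implies in relationships" is as a closure over a set of relation triples. a python sketch under that assumption (the triple layout and the function name close_under are made up):

```python
def close_under(facts, implications):
    """Add every fact implied by the relation-level implications.

    facts: set of (relation, subject, object) triples
    implications: dict mapping a relation to the relations it implies
    """
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rel, subj, obj in list(facts):
            for implied in implications.get(rel, ()):
                new_fact = (implied, subj, obj)
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


facts = {("is-chair-of", "personA", "committeeB")}
implications = {"is-chair-of": ["is-member-of"]}
closed = close_under(facts, implications)
```

after closing, the set also holds ("is-member-of", "personA", "committeeB"), which is the "logicy shortcut" reading rather than a full constraint solver.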

runtime CSP programming? or just logicy shortcut for relations as "first-class" members of the language?

---

mb relations should be considered generalizations of sets just like hyperarcs are generalizations of arcs?

note my hyperarcs are not the usual ones; mine are "directed", hence still the roles of either 'source' or 'target' (src or dst) apply to each; so each of my directed hyperarcs are two sets, the src set and the dst set.

could generalize farther to have additional roles besides dst; this way it merges in with a generalization of relations

namely, a relation of sets, where the first element in each tuple has the special role of 'source' (primary key), and ....

hmmm the specialization of the multiple dst roles seems clumsy... mb not...

anyhow, relations need to be combined into tables (as in relational db)...

--

the relational generalization doesn't fit b/c the point of having a graph-based programming language is that you do lookup table lookups, that is, you request an operation or computation (the arc label) on an object (the node) (alternately, you provide an input (the arc label) to a function (the node)), and then you get a return value or output. the input/output dichotomy is fundamental to computation (you could have a computation-like theory without this, but it would remove the arrow of time). so that's why we need just two roles, src and dest.

it's RDF triples: the function (the src node) is the subject, the arc label is the relation type, and the dst node is the object.

but wait, mb removing time is exactly what we are doing in constraint satisfaction programming. after all, the constraints are operating 'in parallel' and the checks of each possibility to see if it meets the constraints could be operating in parallel. CSP has multiple "inputs" and "outputs" (fixed and free variables), but the constraints themselves are stated without reference to which ones are inputs and which are outputs.
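a brute-force python sketch of the point that constraints are stated without saying which variables are inputs: the same constraint list is solved with different variables fixed. the solve function and its signature are assumptions for illustration only, and the search is exhaustive over small finite domains:

```python
from itertools import product


def solve(constraints, domains, fixed):
    """Yield every assignment that extends `fixed` and satisfies all constraints."""
    free = [v for v in domains if v not in fixed]
    for values in product(*(domains[v] for v in free)):
        env = dict(fixed)
        env.update(zip(free, values))
        if all(check(env) for check in constraints):
            yield env


# one constraint, no direction: x + y == z
constraints = [lambda env: env["x"] + env["y"] == env["z"]]
domains = {name: range(6) for name in ("x", "y", "z")}

forward = list(solve(constraints, domains, {"x": 2, "y": 3}))   # z is free
backward = list(solve(constraints, domains, {"x": 2, "z": 5}))  # y is free
```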

(btw i forgot, what was the syntax for specifying which vars are fixed when we actually solve the constraints)


minimization/max and minmax (and minimin, etc, and their n-ary generalizations) solution in addition to simple (all solution/any solution) constraint satisfaction


no, i think CSP is a generalization in the direction of FEWER roles; there is just one set of input/outputs instead of one set of inputs and one of outputs. no time (0-dimensional) rather than 1-d time (normal) --- having more roles in the relation would be like 2-d or more time; more than one time axis.

btw if the sets can be like inputs and outputs then mb they are ordered, not sets

--- in uml, the 'big guy' (whole of whole/part, owner of has, outer class of outer/inner) is the one with the little shape next to it on the link

---

when one module (or even fn?) calls another, it should be able to do "dependency injection" without the knowledge of the callee, that is, it should be able to redirect which implementations are used for the objects that the callee creates. perhaps it gets to mess with the callee's module namespace, i.e. if the callee imports web.webserver, perhaps instead it gets mock.web.webserver?
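a python approximation of caller-side dependency injection, assuming the callee resolves its imports through a registry that the caller may temporarily patch. IMPORTS, RealServer and MockServer are hypothetical stand-ins for the module namespace and for web.webserver / mock.web.webserver; only unittest.mock.patch.dict is a real library call.

```python
from unittest import mock


class RealServer:
    def handle(self, path):
        return "real:" + path


class MockServer:
    def handle(self, path):
        return "mock:" + path


# stand-in for the callee's module namespace
IMPORTS = {"web.webserver": RealServer}


def callee(path):
    """Creates its own server; knows nothing about any injection."""
    server = IMPORTS["web.webserver"]()
    return server.handle(path)


def call_with_mock(path):
    """The caller redirects which implementation the callee will create."""
    with mock.patch.dict(IMPORTS, {"web.webserver": MockServer}):
        return callee(path)
```

the override lasts only for the duration of the call; afterwards callee sees the real implementation again.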

this is just an extension of the idea that objects don't have attributes, only methods (so everything can be overridden).

---

mb a "level shifting lang" as in the meta hierarchy ---- imagine a programming language standard across a galaxy populated by many different old intelligent (pre-singularity tho) alien races ---- it would probably be very confusing to us, even though its basics would be relatively simple, b/c it would be very general. its central abstractions would be deep generalizations of the stuff we usually use, so they would be somewhat confusing or mb even cumbersome for simple stuff. so the language could be 'level shifting' in that there are ways to indicate that, in some syntactic contexts, the full power is desired, and in others, the operators act as the simple special case versions of themselves (which would act like operators familiar to us). e.g. i'm thinking about generalizing graphs to something like directed hypergraphs or something even more complex, but mb in a low-level (simpler) context they would just be graphs, and the operator . would just be normal graph lookup.

---

jasper: 2 block disambig 1st line special in : obj boundary vanilla terminator ref eq avail, only aliasing bad

---

note that jasper syntax allows something similar to matlab-style vertical and horizontal list concat, e.g. in matlab: a = [3 4]; [1 2 a] == [1 2 3 4]

in jasper:

a = [3 4]; [1 2 @a] == [1 2 3 4]


in indices for slicing, like python, or like matlab (think i already said this). matlab seems to do indexing better than python/numpy, but not sure.

mb use "bad" to get out of most compile-time checks (per-line source code annotation; aw heck, per code node) -- but then interpreter or compiler has to be run with 'bad' option or it won't compile


as part of an interface, you can specify 'checked exceptions', i.e. that something following that interface can't emit any other exceptions. but then the caller can 'layer on' permission to emit others (just by catching the others).


mb list unpack and variadic take and variadic fold should be different number of repetitions of the same symbol:

  a = [3 4]; [1 2 @a] == [1 2 3 4]
  1 2 3 4 @@@variadic_fn
  1 2 3 @@+ == 6

mb @@ is fold by default but variadic if defined, so, the last two lines would be

  1 2 3 4 @@variadic_fn
  1 2 3 @@+ == 6

easier to read and write but less clear.

or mb use the inverse operator:

  a = [3 4]; [1 2 -@a] == [1 2 3 4]
  1 2 3 4 @variadic_fn
  1 2 3 @+ == 6

(harder to type and uglier, but conceptually nicer -- but it's not quite really an inverse anyway, so naw)

so far i like the first one

actually, three different kinds of @s is too much for ppl to remember. so, the second one.
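for comparison, python already separates the three operations discussed here: list unpack is the star operator, variadic take is *args, and fold is functools.reduce. a small sketch (variadic_fn and fold are illustrative names, and variadic_fn just echoes its arguments):

```python
from functools import reduce
from operator import add

a = [3, 4]
unpacked = [1, 2, *a]          # list unpack, like [1 2 @a]


def variadic_fn(*args):
    """Hypothetical variadic function: reports what it received."""
    return list(args)


def fold(op, *args):
    """Syntactic fold: combine all arguments left to right with a binary op."""
    return reduce(op, args)
```

so fold(add, 1, 2, 3) plays the role of 1 2 3 @@+, and variadic_fn(1, 2, 3, 4) the role of 1 2 3 4 @@variadic_fn.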

---

abstract classes? why? just define an interface


postfix '-' could mark lots of things (fold, etc), just map by default


delete array rows, columns by slice, i.e.

in numpy,

 In [15]: a = reshape(range(9),[3 ,3])
 In [16]: a
 Out[16]: 
 array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
 In [23]: numpy.delete(a,1, axis=0)
 Out[23]: 
 array([[0, 1, 2],
       [6, 7, 8]])

but in Octave,

 octave:8> a(2,:)  = []
 a =
   0   1   2
   6   7   8

somehow generalize slice notation to make it extensible? but if they want more than binary slice operator, they'll have to define a new operator, b/c each function can only have one arity. so be it, let things other than : be used for slicing. the idea is just that, rather than slices just being a list of inputs to the function (b/c then how would you delete a column of a matrix by slice), they are first-class objects; and rather than only being lines, they can be any "shape" that the object can handle.
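python's built-in slice objects are one existing take on "slices as first-class objects". a pure-python sketch (no numpy) of deleting rows or columns by slice; delete_rows and delete_cols are made-up helpers operating on a list of lists:

```python
def delete_rows(matrix, rows):
    """Return a copy of matrix without the rows selected by the slice object."""
    doomed = set(range(*rows.indices(len(matrix))))
    return [row for i, row in enumerate(matrix) if i not in doomed]


def delete_cols(matrix, cols):
    """Return a copy of matrix without the columns selected by the slice object."""
    width = len(matrix[0]) if matrix else 0
    doomed = set(range(*cols.indices(width)))
    return [[v for j, v in enumerate(row) if j not in doomed] for row in matrix]


a = [[0, 1, 2],
     [3, 4, 5],
     [6, 7, 8]]
middle = slice(1, 2)   # a slice is an ordinary value that can be stored and passed around
```

delete_rows(a, middle) drops the middle row, matching the numpy and octave examples above; the same slice value can be reused for columns.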

vectorize decorator?

---

 way to mark one implementation as a semantic clone of another w/r/t an interface

---

what does it mean to say functions must have the same arity in the land of currying? it means that they must return the same type. what does this mean in a land without principal types? it means that no predicate may be assumed of the return value of a function unless it may be assumed of all instantiations of that function.


subpackages with dots, like python


/, not :, for lambda

  ++ = x y / x+y
for drop-down, which distinguishes "where" blocks from arguments to a function
  3 n + == 5
    n = 2
  : + == 5
    3
    n
  3 n + == :
    1 4 +

(but mb : should be prohibited at EOL and used for something else, like marking where blocks or somesuch..)

_ for discards

 0 _ * = 0

evaluation is recursive within a block, except later repetitions of a variable are distinguished, and mentions in the rvalue of the var in the lvalue are treated as mentions of the previous mention of that symbol; and except that an indented block is in-scope in the last line previous to it:

  x y + == z
    x = 2
    y = 3
    z = 5

/ allowed in words?

  fun/3 ?
  fun/strict ?

or just fun^strict


topics that i've thought about/to think about:

prototyping, inheritance, composition, delegation parameterized types marked arguments for mapping, ^annotation, data parallelism, strictness $resources, $i18n, KEYWORDS

>, power set graphs, level-shifting lang??

blocks as WHERE maybe' ?metavariables?, quoting graph patterns, predicate types, interfaces {impl attributes}, semantic vs. nonsemantic attributes views boundaries objects, topological/implicit refs, owners, reference aliasing, refs as values seq assertions ! exceptions slices modules, namespaces, dependency injection n-dim arrays CSP, ?metavariables -- or quoting?, probabilities types as sets semantic weby stuff - URIs, triples, inheritance versioning, sync, webdav, different ways of referring to the same thing ez to pass 'self' construction, destruction, field initialization, resource reference counting? read-only many-1 graph assignment state impl switch at runtime? multi-stage programming list expansion, variadic functions, syntactic fold relations

relations, hypergraphs, level-shifting lang, reification, semantic weby stuff - URIs, triples, inheritance; also, versioning, sync, webdav, different ways of referring to the same thing, kant graph patterns, predicate types, parallelism, quantifiers, state, seq CSP, ?metavariables -- or quoting?, probabilities parameterized types, inheritance, construction, destruction, field initialization resource manager, resource reference counting, components maybe' boundaries refs views modules, namespaces, dependency injection multi-stage programming


> could be implemented just by having some nodes (representing sets of nodes) point to a collection of other nodes, and labeling these edges with "refer", and having the links between these higher-level nodes be labeled 'hyperarc'.

the trick is that we want to use the same syntax for manipulating the hyperarcs and the set-nodes as we do for manipulating normal arcs and normal nodes, but we also may want to refer to the underlying nodes of a set-node, but then use the usual syntax for operating on those.

more generally, mb what the level-shifting idea (as yet, ill-defined) is is this:

there is a notion of "reference" by which one object may point to another. you can write a sentence with variables referring to the underlying objects, and write the same sentence referring to the referencing objects, and the same operation will be applied to these different targets (any language will do that). but sometimes you want to "implicitly dereference" the underlying objects, that is, to write out a sentence for an operation on the underlying objects, but instead of using variables that hold them, use variables for the referencing objects that point to them.

  hmm, but it's gotta be more complicated than that, that would only remove one level of "*"s in C, which isn't too clumsy

mb all that's needed is a restricted form of metaprogramming that maps nodes in the AST?

---

remember, nodes have a set of labels, not just one label


append item to list is common.

collection interface has to include member

multidim collections can always be indexed by single integer (flattened)


named multidim (tuple input?)


three generalizations of nodes:

tuple input meta wrapper hyperarc


for wiring things together:

to permute a list: /[permutation]/ is the function that executes that permutation:

  5 7 /[1 0]/ div == 7 5 div

note that permutations are zero-indexed.

elements of lists within a function output may be references via ., and lists may be created:

  [3 5] 7 /[1 [0.1 0.0]]/ func == 7 [5 3] func

this can be used to wire together functions that return multiple outputs:

  /// x f returns [f1 f2]
  /// y g returns [g1 g2 g3]
  ///  we want to calculate g2 [f2 f1] h
  (x f) (y g) /[1.1 [0.1 0.0]]/ h
    /// equivalent:
  (x f) /[0.1 0.0]/ (y g) /[1.1 0]/ h

permutations are macros that are equivalent to surrounding everything to the right up to the boundary of their containing expression with parens, and then composing a permutation function with that:

a b /[1 0]/ c f == a (b (x y : y x (c f)))

so, you can apply a permutation to a function, without providing the inputs, to get a new function with remapped inputs:

 f = /[1 0]/ div
 3 5 f == 5 3 div

The standard library function permuteInputs does the same thing, except for it doesn't change grouping:

 f = [1 0] div permuteInputs
 3 5 f == 5 3 div
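a python sketch of permuteInputs, assuming the permutation is a flat zero-indexed list where position i of the new call is filled from argument perm[i] of the original call (nested specs like 0.1 are not handled). permute_inputs and div are illustrative names:

```python
def permute_inputs(perm, fn):
    """Return a function that calls fn with its arguments reordered by perm."""
    def wrapped(*args):
        return fn(*(args[i] for i in perm))
    return wrapped


def div(x, y):
    return x / y


f = permute_inputs([1, 0], div)
```

so f(3, 5) computes div(5, 3), mirroring "3 5 f == 5 3 div".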

For the sake of simplicity, variable substitution is not permitted within a permutation specification, i.e. "a b : x y /[a b]/ f" is not permitted.

To permute a function's outputs, use backslashes and put the permutation on the right:

  f = x : [x++1 x--1]
  2 f \[1 0]\ == [1 3]

This is a macro equivalent to:

  2 f \[1 0]\ == 2 f // [1 0] permute
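the output-side counterpart in python, assuming the function returns a list and the permutation is again a flat zero-indexed list (permute_outputs and plus_minus_one are invented names; ++ and -- are read as plain add and subtract):

```python
def permute_outputs(perm, fn):
    """Return a function that calls fn and reorders its list of outputs by perm."""
    def wrapped(*args):
        out = fn(*args)
        return [out[i] for i in perm]
    return wrapped


def plus_minus_one(x):
    return [x + 1, x - 1]


g = permute_outputs([1, 0], plus_minus_one)
```

plus_minus_one(2) is [3, 1], so g(2) swaps that to [1, 3].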

to send different inputs to different functions in a list of functions:

/[permutation] : function1 ;; ... ;; functioni ;; ... ;; functionn/

this creates a function which expects n input arguments, each of which must be a list. according to the permutation's specification, each of these inputs will be sent to one of the functions, and the list will be unpacked as if with the @ operator. the outputs of the n functions will be concatenated into a list:

  a b /[0 1] : f1 f2/ == [(@a f1) (@b f2)]
  [a b] [c d e] /[1 0] : f1 f2/ == [(c d e f1) (a b f2)]

variable substitution is permitted into the function list, e.g. f1 f2 : a b /[0 1] : f1 f2/
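a python sketch of the routing form /[permutation] : f1 ;; f2/, under the reading suggested by the two examples above: function i receives input list perm[i], unpacked into its arguments, and the results are collected in function order. route is a made-up name:

```python
def route(perm, *fns):
    """Build a function that sends input list perm[i] (unpacked) to fns[i]."""
    def wrapped(*arg_lists):
        if len(arg_lists) != len(fns):
            raise TypeError("expected one input list per function")
        return [fn(*arg_lists[perm[i]]) for i, fn in enumerate(fns)]
    return wrapped


def total(*xs):
    return sum(xs)


def biggest(*xs):
    return max(xs)


straight = route([0, 1], total, biggest)
crossed = route([1, 0], total, biggest)
```

crossed([1, 2], [3, 4, 5]) sends [3, 4, 5] to total and [1, 2] to biggest, like "[a b] [c d e] /[1 0] : f1 f2/".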

this may not seem very useful; "[(c d e f1) (a b f2)]" is shorter and easier to read than "[a b] [c d e] /[1 0] : f1 f2/". This is intended to be used in conjunction with higher order functions, specifically when you have to pass in something that takes its inputs in a given order, and you have TODOwrong

  --

i am bothered by the asymmetry between having multiple inputs just lying there in the code, and having to have to pack multiple outputs into a list. why not just let functions actually produce multiple outputs, which then can directly be used as multiple inputs? the analog of currying is copying a function once for each of its outputs and applying the appropriate projection function. e.g. if f(x) provides two outputs, x+1 and x+2, then x f = x++1 ;; x++2 f / ++ == x : x 1 x 2 *++

note that / (function composition with grouping) operates over all channels at once. now all the permutation stuff can be expressed as compositions of function composition with uncurrying and a normal permutation on lists and re-currying. so m.b. now want to express it by just attaching / to the permutation list. also, now that outputs aren't lists, don't need the annoying 0.1 syntax, and can have variable substitution into the permutation list (provided there are no list expansions, because the length of the permutation list must stay fixed to determine arity) (if the permutation is determinable at compile-time then the corresponding lambda expression wrapper can be used; otherwise uncurry, permute, curry must be used directly). examples:

  5 7 /[1 0]/ div 
  == 5 7 / x y : [x y] / [1 0] permute / 7 5 (l : l.0 l.1 div)  // note that variable substitution can be
                                                                    accommodated within [1 0] \\
  == 5 7 / x y : y x div 
  == 7 5 div
  x f = x+1 ;; x++2
  f /[1 0]/ == x++2 ;; x++1
  f /[1 0]/ div == x++2 x++1 div
  in general:
  f /perm/ g == f / perm / g == (f / perm) / g == f / (perm / g) /// due to associativity of fn composition

python has

    for surfNodeIdx,vs in enumerate(surfaceToVolume):

but we should just be able to do

    for surfNodeIdx,vs in surfaceToVolume:

also, in python, if you try to iterate over a hash, you get the keys, not the key-value pairs. you have to use hash.iteritems (hash.items in python 3) instead of enumerate, which is confusing.
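the uniform behaviour asked for here can be approximated in python with a small helper that yields (index-or-key, value) pairs for both lists and dicts. indexed is a made-up name, and the sample data is invented:

```python
def indexed(collection):
    """Iterate (key, value) pairs: keys for dicts, positions for sequences."""
    if isinstance(collection, dict):
        return iter(collection.items())
    return enumerate(collection)


surface_to_volume = [[10, 11], [12], []]
depths = {"a": 1, "b": 2}

pairs_from_list = list(indexed(surface_to_volume))
pairs_from_dict = list(indexed(depths))
keys_only = list(depths)   # plain iteration over a dict yields just the keys
```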


mb treat orphan lines as implicit 'prr's, like matlab:

  x ;; y
    x = 3
    x
    y = 4

should return "3 4" and print "3\n" to stdout or stderr or stddbg; mb deactivated in non-debug mode


in numpy, this is confusing:

 In [25]: len(nonzero([0]))
 Out[25]: 1
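the reason for the 1 is that numpy's nonzero returns a tuple with one index array per dimension, so len counts dimensions, not hits. a pure-python imitation for the 1-d case (nonzero_1d is a stand-in written for this note, not the numpy function):

```python
def nonzero_1d(xs):
    """Mimic the shape of numpy.nonzero for 1-d input: a 1-tuple of index lists."""
    return ([i for i, x in enumerate(xs) if x],)


result = nonzero_1d([0])
number_of_dimensions = len(result)      # what len(nonzero([0])) actually measures
number_of_hits = len(result[0])         # what was probably wanted
```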

just use _ for arguments instead of where (and $ for blocks)

go: dont actually import upon import, then recursive imports are a waste

$ standalone x is first positional, nondefault : with no vars instead of $

\ for non-EOL

for type

.. for slicing (not :)


simple syntax for "dual" parts of algorithms, i.e. when you have a > part and a < part and they are dual (or < and >=), so you don't have to write both parts

duality doesn't always just mean < vs >=; sometimes you do an ascending vs. a descending sort, sometimes you add 1 vs. not adding anything; need a general device like "the glutamatergic (gabaergic) neurons usually excite (inhibit)" in English

mb \(), i.e.:

\if a > b: if c <=\(>) d: c = c+1 \(pass) \else:

\if a > b: if c <=\(>) d: (c = c+1)\(pass) \else:

or w/ footnotes:

\if a > b: if c <=\1 d: (c = c+1)\2 \else: \1: > \2: pass

in that toy example it doesn't look very useful, but consider this real example:

  # assumes argsort, nonzero, copy from numpy (pylab-style import) and a module-level epsilon
  def considerMovingLayerBoundaryUp(currentSurfaceNodeVoxels, voxelDepths, currentThisSurfaceNodeLayerDepths, whichLayerBoundary):
      # ascending sort
      voxelsByDepth = currentSurfaceNodeVoxels[argsort(voxelDepths[currentSurfaceNodeVoxels])]
      voxelsAboveOldBoundaryMask = voxelDepths[voxelsByDepth] >= currentThisSurfaceNodeLayerDepths[whichLayerBoundary]
      voxelsAboveOldBoundary = nonzero(voxelsAboveOldBoundaryMask)[0]
      if not len(voxelsAboveOldBoundary):
          # if there aren't any nodes above the boundary
          return (currentThisSurfaceNodeLayerDepths, [], [])
      else:
          # we don't want to mutate currentThisSurfaceNodeLayerDepths
          new_layer_depths = copy(currentThisSurfaceNodeLayerDepths)
          # move the layer boundary up to encompass this node
          first_voxel_to_move = voxelsByDepth[voxelsAboveOldBoundary[0]]
          new_layer_depths[whichLayerBoundary] = voxelDepths[first_voxel_to_move] + epsilon
          # calculate voxels_moved and destination_layer
          voxelsBeneathNewBoundaryMask = voxelDepths[voxelsByDepth] < new_layer_depths[whichLayerBoundary]
          voxels_moved = nonzero(voxelsAboveOldBoundaryMask & voxelsBeneathNewBoundaryMask)
          destination_layer = whichLayerBoundary
          return (new_layer_depths, voxels_moved, destination_layer)

  # assumes argsort, nonzero, copy from numpy (pylab-style import)
  def considerMovingLayerBoundaryDown(currentSurfaceNodeVoxels, voxelDepths, currentThisSurfaceNodeLayerDepths, whichLayerBoundary):
      # descending sort
      voxelsByDepth = currentSurfaceNodeVoxels[argsort(-voxelDepths[currentSurfaceNodeVoxels])]
      voxelsBeneathOldBoundaryMask = voxelDepths[voxelsByDepth] < currentThisSurfaceNodeLayerDepths[whichLayerBoundary]
      voxelsBeneathOldBoundary = nonzero(voxelsBeneathOldBoundaryMask)[0]
      if not len(voxelsBeneathOldBoundary):
          # if there aren't any nodes beneath the boundary
          return (currentThisSurfaceNodeLayerDepths, [], [])
      else:
          # we don't want to mutate currentThisSurfaceNodeLayerDepths
          new_layer_depths = copy(currentThisSurfaceNodeLayerDepths)
          # move the layer boundary up to encompass this node
          first_voxel_to_move = voxelsByDepth[voxelsBeneathOldBoundary[0]]
          new_layer_depths[whichLayerBoundary] = voxelDepths[first_voxel_to_move]
          # calculate voxels_moved and destination_layer
          voxelsAboveNewBoundaryMask = voxelDepths[voxelsByDepth] >= new_layer_depths[whichLayerBoundary]
          voxels_moved = nonzero(voxelsBeneathOldBoundaryMask & voxelsAboveNewBoundaryMask)
          destination_layer = whichLayerBoundary + 1
          return (new_layer_depths, voxels_moved, destination_layer)

that could be changed to:

  \def considerMovingLayerBoundaryUp\1(currentSurfaceNodeVoxels, voxelDepths, currentThisSurfaceNodeLayerDepths, whichLayerBoundary):
      # ascending sort \(descending sort)
      voxelsByDepth = currentSurfaceNodeVoxels[argsort(\(-)voxelDepths[currentSurfaceNodeVoxels])]
      # note: beneath and above are flipped in the \dual
      voxelsAboveOldBoundaryMask = voxelDepths[voxelsByDepth] >=\(<) currentThisSurfaceNodeLayerDepths[whichLayerBoundary]
      voxelsAboveOldBoundary = nonzero(voxelsAboveOldBoundaryMask)[0]
      if not len(voxelsAboveOldBoundary):
          # if there aren't any nodes above the boundary
          return (currentThisSurfaceNodeLayerDepths, [], [])
      else:
          # we don't want to mutate currentThisSurfaceNodeLayerDepths
          new_layer_depths = copy(currentThisSurfaceNodeLayerDepths)
          # move the layer boundary up to encompass this node
          first_voxel_to_move = voxelsByDepth[voxelsAboveOldBoundary[0]]
          new_layer_depths[whichLayerBoundary] = voxelDepths[first_voxel_to_move] (+ epsilon)\()
          # calculate voxels_moved and destination_layer
          voxelsBeneathNewBoundaryMask = voxelDepths[voxelsByDepth] <\(>=) new_layer_depths[whichLayerBoundary]
          voxels_moved = nonzero(voxelsAboveOldBoundaryMask & voxelsBeneathNewBoundaryMask)
          destination_layer = whichLayerBoundary \(+1)
          return (new_layer_depths, voxels_moved, destination_layer)

\1: considerMovingLayerBoundaryDown

the latter is much better for maintenance b/c when you have to change something, you only have to change it once. could be generalized to n-ary brothers, but mb that is more confusing
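without new syntax, the closest python idiom is probably to collect the dual choices in a table and write the body once. a toy sketch under that assumption (DUALS and first_past are invented names; this is far simpler than the voxel example above):

```python
import operator

# each direction bundles the choices that flip between the two duals
DUALS = {
    "up": {"sign": +1, "crosses": operator.ge},    # ascending sort, >= boundary
    "down": {"sign": -1, "crosses": operator.lt},  # descending sort, < boundary
}


def first_past(depths, boundary, direction):
    """Return the depth nearest to the boundary on the far side, or None."""
    dual = DUALS[direction]
    ordered = sorted(depths, key=lambda d: dual["sign"] * d)
    past = [d for d in ordered if dual["crosses"](d, boundary)]
    return past[0] if past else None
```

a change to the shared logic is then made once, at the cost of the flipped parts living in a table away from where they are used.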

--- allow comma-separated lists if there's a comma?


distinguish indices from actual integers (and indices-of-indices, etc)

---

def macro so that fn defs are ez to cut and paste; or mb just if there are more than one equals in a line, the second acts like a :? no, but the fn name...?



should we have syntax for filtering a list, or just have comprehensions?

comment

universal and existential quantifier: &&all

any

could use / \ for grouping? meta?

should a.b be a\b instead? or is \ too hard to find on kbd?

if & is AND and && is forall, then should we have something similar for "not"? mb _ is "not" and discard is "__"? or, since we can't have negation in patterns, just have _ be discard in patterns?

should ,, be EOL, instead of ;; or ;?

note: runtime inheritance means state io's blocks (which have parent pointing to the namespace in which they were created) and objects (whose parents point to the obj parent) are giving state. closures are more than just compile-time scope (like haskell), they r state

call init upon clone like in Io; also, detect runtime inheritance slot lookup loops at runtime

"symbol map" style metaprogramming; i.e. a transformation over symbols

