lists-programmingConstructs2

---

" taw 4 points5 points6 points 3 years ago[-]

I haven't used Haskell type system goodies (like type classes) that much, but OCaml type system is anything but "very expressive".

OCaml doesn't even let (fun a b -> a + b) apply to a pair of floats, let alone your own types.

You want 2d point and 3d point records and be able to say (a.x, a.y, a.z) and (b.x, b.y) ? In OCaml names of record fields must be unique inside the whole program, so the best you can do is (a.point3d_z, b.point2d_x).

These two things work fine in Haskell, Nemerle and even SML.

Or the other way around. OCaml is smart enough to know that (Printf.sprintf "%s <%d, %d>") is a function that takes a string and two integers and returns a string, while SML and Haskell have absolutely no way of expressing something like that.

Another example. Let's try to write a function that takes a function that does something with lists (like identity or reverse or length) and applies it to two different lists:

    (fun f -> (f [1; 2; 3], f ["a"; "b"]));; It is a perfectly good function, but OCaml won't let it compile because its type system isn't good enough. It won't even let us explicitly type f (well there's always Obj.magic that ignores type system completely, but OCaml like C has zero runtime checks, so if we make a mistake it segfaults, istead of raising a pretty exception with full stack trace)

Haskell won't let us compile this either:

    (\f -> (f [1, 2, 3], f ["a", "b"]))

Neither will Nemerle:

    fun(f) { (f([1, 2, 3]), f(["a", "b"])) }

What the heck is going on ? Well, of course they have a pretty good excuse for this case - the most generic shape of type term for f cannot be uniquely determined, and unification-based type inference engines (SML, OCaml, Haskell etc.) for fundamental reasons cannot work with such cases. Nemerle uses constraint solver-based type inference engine, which is a bit more powerful, but doesn't handle this case (yet ?). No matter which language you'll take, the type system will keep getting in your way, and perfectly valid programs will refuse to compile.

Static typing and type inference may still be well worth it, I'm not saying they're not. But "very expressive" ? Come on.

sleepingsquirrel 3 points4 points5 points 3 years ago[+] (0 children)

sleepingsquirrel 3 points4 points5 points 3 years ago[-]

    OCaml is smart enough to know that (Printf.sprintf "%s <%d, %d>") is a function that takes a string and two integers and returns a string, while SML and Haskell have absolutely no way of expressing something like that.

Text.Printf

    Haskell won't let us compile this either: (\f -> (f [1, 2, 3], f ["a", "b"]))

Try with "ghc -fglasgow-exts" or "hugs -98"...

main = print $ f reverse

f :: (forall a. ([a]->[a])) -> ([Integer],[Char]) f x = (x [1,2,3], x "abc")

Excedrin 1 point2 points3 points 3 years ago[+] (3 children)

Excedrin 1 point2 points3 points 3 years ago[-]

    Or the other way around. OCaml is smart enough to know that (Printf.sprintf "%s <%d, %d>") is a function that takes a string and two integers and returns a string, while SML and Haskell have absolutely no way of expressing something like that.

In my opinion, the functional unparsing http://www.brics.dk/RS/98/12/ style printf is much better than the OCaml or SML/NJ printf library functions. Here's an SML implementation of the idea: http://mlton.org/Printf

taw 0 points1 point2 points 3 years ago[+] (2 children)

taw 0 points1 point2 points 3 years ago[-]

Internally OCaml stdlib (not the language) converts format to something like the thing you linked to. It's somewhat more readable and more convenient to write:

    (sprintf "<%f, %f, %f>" a.x a.y a.z)

than:

    (sprintf &#x60;"<"F&#x60;", "F&#x60;", "F&#x60;">" $ a.x a.y a.z)

And either of them is far better than (printf-less OCaml):

    ("^" ^ (string_of_float a.x) ^ ", " ^ (string_of_float a.y) ^ ", " ^ (string_of_float a.z) ^ ">")

Maybe it's a small thing, but it's very common. If you add enough such small things, the difference can get really huge.

Excedrin 1 point2 points3 points 3 years ago[+] (1 child)

Excedrin 1 point2 points3 points 3 years ago[-]

There's an OCaml implementation here:

http://tkb.mpl.com/~tkb/software.html

Also, a Haskell version:

http://www.cs.nott.ac.uk/~gmh/wgp01/hinze-paper.pdf

My point was more that "... SML and Haskell have absolutely no way of expressing something like that" isn't exactly correct for some values of "something like that." Also, a format string isn't exactly the cleanest solution based on my idea of clean staticly typed functional language code.

I'm glad that someone else has already figured out "~:D" and "~S" and they're part of the spec, documented and ready to use. Maybe it's a small thing, but it's very common. If you add enough such small things, the difference can get really huge.

taw 0 points1 point2 points 3 years ago[+] (0 children)

taw 0 points1 point2 points 3 years ago[-]

Oh well, I guess it's good enough.

A major difference is that while Haskell solution only does verification at run time (not that I really mind):

    "Mismatch between the argument types and the format string will cause an exception to be thrown at runtime."

OCaml does so at compile time, by smoke and mirrors :-)

    Printf.sprintf "%s<%d, %d>";;

"

http://www.brics.dk/RS/98/12/

toread: http://nemerle.org/blog/archive/2005/Mar-13.html

toread: http://www.paulgraham.com/arclessons.html

http://www.paulgraham.com/arclessons.html : "Implicit local variables conflict with macros."

toread: http://www.reddit.com/r/programming/search?q=erlang&sort=top

libraries: core need to use to implement the compiler or interpreter? or mb just really important -- everyone should kno really well -- consider these part of the language base libs that are not core but still almost part of the language. available on any (normal) installation. ppl should become familiar with them incorporated but not shipped some libs may be "owned" by the main team but not available on any (normal) installation. recommended this category is lacking in many languages. these are libs that are not necc shipped w/ the language, not owned by the main team, mb not as stable, but which are officially blessed. the idea isn't that the main team strongly endorses the current state of the lib, but rather that they say, 'ppl r using this, this seems to be the frontrunner for what it does, you should be aware that it's out there and consider using it'. like gvr blessing django.

i don't understand this: http://archive.netbsd.se/?ml=haskell-libraries&a=2007-07&m=4678486

random guy's idea, toread. one idea is that "start" and "stop" messages are basic.

http://rebelscience.blogspot.com/2008/10/cosa-new-kind-of-programming-part-iii.html


decided not to do this:

 An exception to this equivalence (and to right-associativity) is that if the function as written takes k arguments, but it returns a function, then you are not allowed to write it with than k arguments next to it; the extra arguments must be separated from the "real" arguments by a '(' or or a ',' or a '.'. So, for example, if f is a function that takes two arguments, which are themselves function, and returns g, which is a function that takes two arguments, then according to right-associativity you would be able to write:

1 2 x y f


plugins: envisage http://code.enthought.com/projects/envisage/


note: when B and C don't override, the http://en.wikipedia.org/wiki/Diamond_problem diamond inheritance problem solved by keeping track of where the method originally came from, not where it immediately came from


http://lambda-the-ultimate.org/node/36373

A few parallel designs...

Peter Van Roy has written a few documents that survey different forms of concurrency. CTM and a recent chapter Programming Paradigms for Dummies [LTU Node] include such surveys.

Tim Sweeney's Next Mainstream Programming Language [LTU Node] includes a survey of a few and their 'places' in game design. He mentions transactions above, and support for wide-vector parallelization qualifies as a form of data-parallel computation.

My own interest is in 'open' distributed computing, but concurrency is part and parcel to distributed designs.

Data-Parallel computation, possible with pure logic computation and pure functional computation, allow parallelization without affecting semantics. This is generally considered "parallelization" rather than "concurrency" because only the implementation is concurrent, not the semantics.

Some systems, like Google's MapReduce?, aim to achieve data-parallel computations by shipping the computation to a much larger data-set. Wide-vector operations seen in supercomputers, graphics cards, and to a lesser degree in modern processors achieve this at a much smaller degree, but still achieve useful performance boosts.

The main challenge with data-parallel is deciding where to break up the computations - a challenge greatly exacerbated by the fact that the cost-to-benefit ratio of any given breakdown depends heavily on both the implementation architecture and on pressure from other sources of concurrency. If decided by hand, there is a high syntactic overhead, and the decision is likely too 'local' - the annotations will need be adjusted for different architectures. One may do better with some sort of adaptive design that uses profiling to decide where to break a computation down.

For distributed data-parallelization, one must also manage inevitable operations and node failures, which isn't a trivial problem. One can't fire and forget.

Functional Reactive Programming, with data-binding allows a great deal of parallelization and also has very nice features for caching, distribution and efficient multi-cast, disruption tolerance, and minimizing latencies. I personally think it to be an excellent bet for describing scene graphs. I think PVR called this 'synchronous dataflow programming' in paradigms for dummies. One can also do logic-programming with data-binding, which is neat.

In general, this is implemented such that updates invalidate caches eagerly and subscribers recompute lazily. One must avoid a common fallacy where one sees 'glitches' where, for example, x changes from 3 to 4 atomically but observers of (x*(x+2)) see 15->20->24 or 15->18->24.

Presumably, data-binding could be usefully combined with transactions to allow more than one variable to update as a single atomic action. However, I have not seen this in practice.

Blackboard Architecture and Tuple Spaces achieve concurrency by having a bunch of active agents talking to a central database. One can generally wait until a tuple with a given pattern is identified. This has a major advantage in that the elements are not temporally coupled and have excellent disruption tolerance. One can also perform ACID transactions on the central database.

The main weaknesses of this design. (1) Coordination of components to perform a shared activity becomes very complex. Different components cannot see each-other's transactions until they are completed, so multi-step behaviors between components cannot be performed atomically. Due to this limitation, individual components of the system will have inherent pressure to 'bloat up' to reinvent features rather than coordinate to achieve them. (2) Cleanup. Any given update might be observed by many components over a long period of time. It is rarely clear when or by whom a tuple should be eliminated from the repository. One can use expirations and such, but ultimately it's going to be heuristic and best guess. (3) Security and sharing (across organizations) - the use of a central database to coordinate elements is common, but not fundamentally secure. Such systems tend to have 'egg-shell' security - rather fragile, and a gooey middle that can rot quite easily after a breach.

Central point of failure is also a potential concern, but may be mitigated by varied design of the database for redundancy or distribution.

One could presumably use micro-databases to coordinate and secure them as object capabilities, but then distribution of the micro-db caps is a problem and so is multi-database pattern-matching and multi-database updates (one could fall back on FRP and distributed transactions, though).

Complex Event Processing achieves concurrency by having components produce and react to events.

Unlike message passing concurrency, there is reduced coupling between producer and consumer, usually via some sort of 'channels' or 'component BUS' or other publish subscribe model. The reduced coupling is advantageous for distribution and modularity, and also allows the system as a whole to continue even when individual nodes fail.

The main challenges of the CEP architecture occur upon startup and shutdown, as components may 'miss' important events. Usefully, one can hybridize with the 'database' approach and have the network itself keep a history to reduce startup and shutdown issues.

There is much promise in combining CEP with distributed transactions, where a sequence of events - including reactions from listeners - may be party of a common (hierarchical) transaction.

One can also optimize CEP by ensuring it is demand-driven: a source for events is made available in a registry, but isn't 'activated' until there is a corresponding sink for events. This design can, for example, automatically enable and disable a webcam based on whether there is a frame-demand. This sort of demand-driven nature can be chained through a CEP system by creating each CEP 'channel' with three facets: one to subscribe, one to update, and one suitable for data-binding that simply indicates whether someone is subscribed. (An object-per-facet is quite suitable for object capability security...)

I also propose Simple Event Processing (SEP): like CEP with a few limitations aimed to achieve greater 'implicit' parallelization by sharing computation efforts. SEP forbids creation of new events, forbids operations that reference more than one event, and forbids side-effects (no internal state, no 'sends'). An SEP network could union, filter, and transform event streams, and may be combined with functional-reactive and data-binding to filter, transform, and union streams based on mutable policies. The main advantages of this tweak: (1) it becomes possible to automatically optimize multi-cast and sharing of common 'sub-network' computations (useful when a few thousand observers might be watching an ad-hoc 'query' whose results you don't wish to name and prepare explicitly in advance). (2) Demand-driven nature can be enforced automatically end-to-end on the SEP network. CEP could then be implemented at the very 'edges' of an SEP network. To my knowledge, these optimizations have never been targeted by an existing implementation.

Publish, Subscribe models focus on distribution of events and data (of the sort used by SEP/CEP/FRP). A given model determines how one describes a subscription, and how much the model 'knows' about the data being published. Of particular interest are 'domain-centric' schemes where one can (say) describe data and event feeds in terms of geographic and geometric regions (tell me about crime within 2 miles of lat/lon). This sort of scheme would work well with massively concurrent systems with a potentially large number of multi-cast consumers, such as an MMORPG.

Publish-subscribe can be securely composed by having capability-secure but composable 'registries' for data-sources. I.e. I can create a new registry and add some sources, have it compose Google's registry, so I can see Google's data but Google cannot see mine. The trick is to achieve this coordination without introducing a 'demand' on the event and data sources, without hindering multi-cast optimizations, and potentially under some anonymity/blinding constraints (e.g. via a trusted intermediary). That takes some design effort, but is doable. By dmbarbour at Thu, 2009-10-15 19:50

Thanks - Looking also for coarser grained mechanisms
login or register to post comments

For a crude example, servlets operating under a web application server; or simple 'grep' and 'wc' style utilities tied together in a Unix pipeline. Etc.

Again, thanks much!

Scott By scottmcl at Thu, 2009-10-15 23:01

p.s. On the lower level control mechanism .....
login or register to post comments

Given the ire drawn by the threads + locks model, I'm curious what folks think this the most elegant concurrency control mechanism/framework.

Again, mucho, mucho thanks in advance.

Scott By scottmcl at Thu, 2009-10-15 23:04

Most Elegant Threads...
login or register to post comments

The threads+locks+shared-state model contradicts many principles associated with 'elegance'. For example, elegant systems should be composable, but two lock-based programs - independently correct - may deadlock when composed.

Shared state design also introduces challenges for explicit memory management, mutable collections (esp. with idioms that map an operation across the collection), visitor patterns, etc.

That said, threads and shared state can work without locks. Wait-free and lock-free algorithms and data-structures can be reasonably elegant. And a language with greater support for 'persistent' structures (such as ropes, copy-on-write maps, etc.) avoid many problems of shared state. By dmbarbour at Fri, 2009-10-16 16:56

wait-free can be hard, too
login or register to post comments

i got the impression that writing a wait or lock free system was one of those Really Hard To Get Right So 99% Of Us Shouldn't Think They Can things. By raould at Fri, 2009-10-23 21:25

totally agreed
login or register to post comments

Wait-free is really, really, really hard to get right.

That's why I said 'elegant' and not 'easy'. Like ballet. ^_^ By dmbarbour at Fri, 2009-10-23 21:28

login or register to post comments

http://en.wikipedia.org/wiki/Fortress_%28programming_language%29


http://lambda-the-ultimate.org/node/3465


note: matrices are both containers (as matrixes) and fns (as linear operators), providing a potential reason to distinguish b/t __get (subscript) and __call (fn call). however, they are really only linear ops w/r/t a basis, so mb these are separate objects? and __call = __get captures the identity of sequences w/ functions on integers


should be a shortcut for this sort of thing:

  1. compute conditional probabilities for i in range(len(domain.attributes)): for j in range(len(domain.attributes[i].values)): for k in range(len(domain.classVar.values)): pc[i][j][k] = (pc[i][j][k] + self.m * p_class[k])/ (n_class[k] + self.m)

list of neat programming languages: http://tldp.org/HOWTO/AI-Alife-HOWTO-7.html


some ppl who think about pls:

lambatheultimate

dick gabriel (sun?) http://www.dreamsongs.com/ steele gvr sussman haskell guys, etc


read the rest of this: http://www.dreamsongs.com/10ideas.html


listen to this:

http://www.podbean.com/podcast-detail-episode/342059/episode-19-keynote----50-in-50


why you need to use a symbol table during parsing of c++:

http://compilers.iecc.com/comparch/article/98-07-199 http://stackoverflow.com/questions/1725975/no-symbol-table-in-go


why the "go" language doesn't have assertions:

" Where is assert?

Go doesn't provide assertions. They are undeniably convenient, but our experience has been that programmers use them as a crutch to avoid thinking about proper error handling and reporting. Proper error handling means that servers continue operation after non-fatal errors instead of crashing. Proper error reporting means that errors are direct and to the point, saving the programmer from interpreting a large crash trace. Precise errors are particularly important when the programmer seeing the errors is not familiar with the code.

The same arguments apply to the use of assert() in test programs. Proper error handling means letting other tests run after one has failed, so that the person debugging the failure gets a complete picture of what is wrong. It is more useful for a test to report that isPrime gives the wrong answer for 2, 3, 5, and 7 (or for 2, 4, 8, and 16) than to report that isPrime gives the wrong answer for 2 and therefore no more tests were run. The programmer who triggers the test failure may not be familiar with the code that fails. Time invested writing a good error message now pays off later when the test breaks.

In testing, if the amount of extra code required to write good errors seems repetitive and overwhelming, it might work better as a table-driven test instead. Go has excellent support for data structure literals.

We understand that this is a point of contention. There are many things in the Go language and libraries that differ from modern practices, simply because we feel it's sometimes worth trying a different approach. "

i note that if you have exceptions (go doesn't) and if you can catch an exception and tell the language to just continue on from where it left off (unlike python, but i think some languages do this), then a tester can cause assertions to be ignored just by calling "main" and catching assertion exceptions and ignoring them and telling the computer to just continue on where it left off


interesting bit from the go faq (altho i guess its not relevant to jasper b/c we dont have inheritance b/c of the 'you can only refer to an interface' philosophy):

Why is there no type inheritance?

Object-oriented programming, at least in the best-known languages, involves too much discussion of the relationships between types, relationships that often could be derived automatically. Go takes a different approach.

Rather than requiring the programmer to declare ahead of time that two types are related, in Go a type automatically satisfies any interface that specifies a subset of its methods. Besides reducing the bookkeeping, this approach has real advantages. Types can satisfy many interfaces at once, without the complexities of traditional multiple inheritance. Interfaces can be very lightweight having one or even zero methods in an interface can express useful concepts. Interfaces can be added after the fact if a new idea comes along or for testing without annotating the original types. Because there are no explicit relationships between types and interfaces, there is no type hierarchy to manage or discuss.

It's possible to use these ideas to construct something analogous to type-safe Unix pipes. For instance, see how fmt.Fprintf enables formatted printing to any output, not just a file, or how the bufio package can be completely separate from file I/O, or how the crypto packages stitch together block and stream ciphers. All these ideas stem from a single interface (io.Writer) representing a single method (Write). And that's only scratching the surface.


think about how to express concurrency constructs more elegantly using graphs

how to make monitors, timber construcs in an extensible way


tables, associative arrays, maps, dicts, hashtables. i guess "map" is a good name. in jasper, the map data structure is also actually a function, so this is even better.


http://lambda-the-ultimate.org/node/1394

....

Other languages are designed exactly for these things that Proebsting mentions: XML manipulation is prominent in CDuce, XDuce, and Scala, and in the LL2 video Proebsting proposes Erlang as a solution to the concurrency problem.


as noted on http://en.wikipedia.org/wiki/Parameter_covariance , there is an issue if, for example, String is a subtype of Object, and arrays of strings are subtypes of arrays of objects, since in fact you can't substitute a String array for any Object array; the String array won't be able to accomodate an insertion of an Int, for example.

in Jasper, there would be an Object array implementation (parametric polymorphism/generic programming, using a type variable, i.e. "?x Array", and a String array implementation. If a list of only Strings was required, the String array implementation would be selected, as it is more specific.


http://en.wikipedia.org/wiki/Parameter_covariance#Java has an interesting feature with the wildcards in parametric polymorphism that can have upper or lower bounds (covariant or contravariant)


some interesting ones from MUMPS http://en.wikipedia.org/wiki/MUMPS:

Arrays: are created dynamically, stored as B-trees, use almost no space for missing nodes, can use any number of subscripts, and subscripts can be strings or numeric (including floating point). Arrays are always automatically stored in sorted order, so there is never any occasion to sort, pack, reorder, or otherwise reorganize the database. $ORDER, $ZPREVIOUS, and $QUERY functions provide efficient traversal of the fundamental array structure, on disk or in memory.

for i=10000:1:12345 set sqtable(i)=i*i set address("Smith","Daniel")="dpbsmith@world.std.com"

Local arrays: variable names not beginning with caret are stored in memory by process, are private to the creating process, expire when the creating process terminates. The available storage depends on partition size, but is typically small (32K).

Global arrays: ^abc, ^def. These are stored on disk, are available to all processes, and are persistent when the creating process terminates. Very large globals (eg, hundreds of megabytes) are practical and efficient in most implementations. This is MUMPS' main "database" mechanism. It is used instead of calling on the operating system to create, write, and read files.

Indirection: in many contexts, @VBL can be used, and effectively substitutes the contents of VBL into another MUMPS statement. SET XYZ="ABC" SET @XYZ=123 sets the variable ABC to 123. SET SUBROU="REPORT" DO @SUBROU performs the subroutine named REPORT. This is effectively the operational equivalent of "pointers" in other languages.

Piece function: This breaks variables into pieces guided by a user specified separator character. Those who know awk will find this familiar. $PIECE(STRINGVAR,"^",3) means the "third caret-separated piece of STRINGVAR." It can appear as an assignment target. After

SET X="dpbsmith@world.std.com"

$PIECE("world.std.com",".",2) yields "std" SET $P(X,"@",1)="office" causes X to become "office@world.std.com" (note that $P is equivalent to $PIECE and could be written as such).

Order function

Set stuff(6)="xyz",stuff(10)=26,stuff(15)=""

$Order(stuff("")) yields 6, $Order(stuff(6)) yields 10, $Order(stuff(8)) yields 10, $Order(stuff(10)) yields 15, $Order(stuff(15)) yields "".

Set i="" For Set i=$O(stuff(i)) Quit:i= Write !,i,10,stuff(i)

Here, the argument-less For repeats until stopped by a terminating Quit. This line prints a table of i and stuff(i) where i is successively 6, 10, and 15.

Multi-User/Multi-Tasking/Multi-Processor: MUMPS supports multiple simultaneous users and processes even when the underlying operating system does not (Eg. MS-DOS). Additionally, by specifying a machine name in a variable (as in SET ^

"DENVER"A(1000)="Foo"), you can access data on remote machines.

from icon http://en.wikipedia.org/wiki/Icon_programming_language

  every write(someFunction(i to j))

like

  someFunction.i:j write map

notes on PlusCal? http://research.microsoft.com/en-us/um/people/lamport/pubs/pluscal.pdf

from Lamport's website: "

  1. The PlusCal? Algorithm Language Theoretical Aspects of Computing-ICTAC 2009, Martin Leucker and Carroll Morgan editors. Lecture Notes in Computer Science, number 5684, 36-60. PDF PlusCal? (formerly called +CAL) is an algorithm language. It is meant to replace pseudo-code for writing high-level descriptions of algorithms. An algorithm written in PlusCal? is translated into a TLA+ specification that can be checked with the TLC model checker [126]. This paper describes the language and the rationale for its design. A language manual and further information are available here.

An earlier version was rejected from POPL 2007. Based on the reviews I received and comments from Simon Peyton-Jones, I revised the paper and submitted it to TOPLAS, but it was again rejected. It may be possible to write a paper about PlusCal? that would be considered publishable by the programming-language community. However, such a paper is not the one I want to write. For example, two of the three TOPLAS reviewers wanted the paper to contain a formal semantics--something that I would expect people interested in using PlusCal? to find quite boring. (A formal TLA+ specification of the semantics is available on the Web.) I therefore decided to publish it as an invited paper in the ICTAC conference proceedings. "

Most of the PlusCal? examples are not suitable as a programs because they define things in terms of brute-force enumeration through all possibilities (this is fine for PlusCal?'s purpose, but here i'm looking at it for ideas for programming language constructs) from which items meeting certain criteria are chosen (i.e. basically, a common idiom in PlusCal? is to pose an N.P. complete problem whenever you want something done, that is, to give a set of easily-checked conditions that you want satisfied and a set of potential solutions to enumerate over). However, it's still worth combing through for constructs:

function ("algorithm") defn -- already in jasper

predicate defn -- already in jasper (via the constraint satisfaction stuff)

the above mentioned brute force syntax: e.g. Divides(i,j) = \exists k \in 0..j : j = i * k. this style requires lists, forall, exists, and "with" nondeterminism, see below.

\forall and \exists, which take predicates over elements and return predicates over lists, written e.g. \exists k \in LIST : CONDITION. these should be lib fns in jasper.

multiple assignment: u := v

v := u is like t := u; u := v; v := t. the key is that all the things on the right are evaluated before any of the things on the left
  in Jasper, this will be a special case of graph assignment.

"with" nondeterminism: using the "with" keyword, a value is selected via exhaustive search that satisfies a condition. in jasper, this is given by the constraint satisfaction programming stuff, which needs to be augmented by a possibility list.

conversion of the set of values satisfying constraints into an explicit list, e.g. B \in {C \in Perms(A) : condition1 AND condition2}. In this case, b/c we are using "B \in {}", this list is just used as another constraint, but the {} operator itself seems like it could be used to create an explicit list.

goto, and labels

"await", which blocks until some condition is true e.g. await y=0

arrays

if

atomic operations e.g. "<y := 0>" in the pseudo-code or "label1: y := 0" when labels are used to indicate atomicity

n-way fork

boolean ops, std comparison ops, set union

can quantify (i.e. exhaustively search) over finite arrays = quantify over functions with finite domains (e.g. Auto(S) = {f \in

S -> S : ...})

set subtraction e.g. Nat \ {0} for positive integers

map e.g. {i \in Nat : i > 0}

filter e.g. {i+1 : i \in Nat}

"domain" function, which fetches the finite domain of some fn over a finite domain (i.e. some array)

while loops

"either S1 or S2 or ...." for nondeterministic branching

http://en.wikipedia.org/wiki/Guarded_Command_Language GCL's do: Repetition: do

The repetition executes the guarded commands repeatedly until none of the guards are true. Usually there is only one guard. [edit] Syntax

do G0 \rightarrow S0 [] G1 \rightarrow S1 ... [] Gn \rightarrow Sn od

example: euclid algorithm for GCD:

a, b := A, B; do a < b \rightarrow b := b - a [] b < a \rightarrow a := a - b od

GCL's cond (called if) is nondeterministic in that if more than one condition is true, more than one of the associated statements may be executed. my note: a variant in which all the associated statements of all conditionals seems are executed seems more appropriate.

note the distinction between the use of cond in imperative languages and the use of case in expressions; if none of the conditions are true in the imperative setting, an obvious default is to do nothing; whereas in the expression (functional) setting there is no obvious default. however, one may also view the imperative commands as returning "success" or "fail" in addition to their side-effects; in this case, it is unclear whether an imperative cond construct should return success or fail if none of the conditions are met (my pick would be to have it return fail, but then to have a lightweight syntax for disregarding failure)

exceptions may be status reports, warnings (resumable and resumed by default) or errors (not resumable). lightweight syntax for ignoring errors on unimportant statements. that is, exceptions and logging can be combined. mb need facility for logger to ask to get all messages from subroutines, even if the caller deems them unimportant and intercepts them. perhaps intercepted messages transform into "ghost" messages, which still percolate up and cannot be blocked (nice analog with christian theology btw; after their death, the ghosts of messages cannot have any material effect upon execution, e.g. by causing an error, but must still ascend up the call chain to heaven, i.e. the First Cause, so that when they reach heaven a record of their content may be written in the log, perhaps in order to assess the performance of the program which generated the message.

so, note that both cond and do seem to admit any/all modal variants; on each iteration of the do, are all applicable branches called, or is just one nondeterministically picked?

according to http://en.wikipedia.org/wiki/Guarded_Command_Language , GCL is good for expressing quasi-delay-insensitive circuits, which sound like a good neural async computation model for the brain. QDI circuits are delay insensitive except for the presence of isochronic forks. asymmetric isochronic forks mean that one path is guaranteed to be shorter than the other. apparently pure delay-insensitive circuits are not Turing complete, and QDI are, so QDI is a candidate for one sort of weakest constraint.


string scanning from Icon:

 s := "this is a string"
 s ? {                               # Establish string scanning environment 
     while not pos(0) do  {          # Test for end of string 
         tab(many(' '))              # Skip past any blanks
         word := tab(upto(' ') | 0)  # the next word is up to the next blank -or- the end of the line
         write(word)                 # write the word
     }
  }

F# LINQ and quoting: http://tomasp.net/articles/dynamic-flinq.aspx


"Perl's ability to "glue together" other programs, or transform the output of one program so it can be used as input to another. "

"A good scripting language is a high-level software development language that allows for quick and easy development of trivial tools while having the process flow and data organization necessary to also develop complex applications. It must be fast while executing. It must be efficient when calling system resources such as file operations, interprocess communications, and process control. A great scripting language runs on every popular operating system, is tuned for information processing (free form text) and yet is excellent at data processing (numbers and raw, binary data). It is embeddable, and extensible. Perl fits all of these criteria. "


http://www.softpanorama.org/People/Wall/index.shtml


kant's category of community is putting a number of parts together into one concept corresponding to the judgement which is the exclusive OR of the parts. this corresponds to an enum in a programming language.

---

model-based evaluation of clusterting validation measures


In the following, "=" means "proportional to"

Let V1_1 refer to the population of neurons in V1 receiving input from LGN, and V2_1 to the population in V2 receiving input from V1. Let

  1. SOME_POP mean the number of cells in population SOME_POP. E.g. #V1_1 means "# of cells in V1_1". Let SOME_AREA_IN be the population of neurons serving as input to SOME_AREA. Let SOME_AREA_TO_SOME_OTHER_AREA be the population of neurons projecting from SOME_AREA to SOME_OTHER_AREA.

Assume that: (1) #V2 = #V1 (2) #V1 = #V1_in^(3/2) (3) V2_1 is doing a 3/2 wavelet transform of whatever its input from V1 is (4) V1_1 dominates the count of cells in V1 (i.e. it scales up as fast or faster than any other subpopulation in V1, to the extent that if you count #V1 you are basically counting #V1_1), and likewise V2_1 dominates #V2 (5) Every cell has an output and every cell's output gets used (6) V2 gets all its input from V1 (100) V1 is not doing any other computation after its initial wavelet transform and before it sends stuff to other areas (i.e. V1 basically only contains V1_1)

Because of (3), we have

(7) #V2_1 = #V2_1_in^(3/2)

Because of (6) and the definition of V2_1, we have

(8) V2_1_in = V1_TO_V2

Because of (4) and (1),

(9) #V1 = #V1_1 = #V2 = #V2_2

therefore, by (7), (8), and (9),

  1. V2_1 = #V1_to_V2^(3/2) V1_to_V2 = #V1_1^(2/3)

By (100), each neuron in V1_to_V2 corresponds to exactly one neuron in V1_1.

Since #V1_to_V2 = #V1_1^(2/3), we know that #V1_to_V2 < #V1_1, and therefore so there are a number of neurons in V1_1 which aren't sending their outputs to V2_1 (call them "orphans"). Since V2_1 is defined as that part of V2 receiving input from V1, there are a number of neurons in V1_1 which aren't sending their outputs to any part of V2. The number of these is proportional to #V1_1^(1/3). By (5), these outputs must be going somewhere, so they must be going to some other area besides V2.

Let's say the orphans are going to Vx_1, and that they are the only input to Vx_1. We would have #Vx_1 = #orphans^(3/2) = (#V1_1^(1/3))^(3/2) = #V1_1^(1/2). So, the scaling relation between

  1. V1_1 and #Vx_1 would not be the same as between #V1_1 and #V2_1. Assuming that #Vx_1 dominates the count of some area VX, we have that there is some cortical area VX that doesn't scale proportionately with V1.

So, if we keep assumpions (1)-(6), then either we have various cortical areas which don't scale proportionately to each other, or (100) must be wrong. (100) could be wrong if we let each cortical area contain an addition "reduction" computation which is done after the initial wavelet transform and which combines the results of many neurons into fewer. So, there could be a V1_2 population of neurons with #V1_2 = #V1_1^(2/3), such that V1_2 projects to V2, not V1_1.


Is the embeddable aspect of Falcon versatile?

I talked diffusely about that in the Regex example above, but other than the reconfigurability and sharing of pre-loaded modules across application vm, we have more. The vm itself has many virtual methods that can be overloaded by the target application, and is light enough to allow a one-vm-per-script model. Heavy scripts can have their own vm in the target application, and can happily be run each in its own thread; yet vms can be recycled by de-linking just run scripts and linking new ones, keeping the existing modules so that they're already served to scripts needing them.

The vm itself can interact with the embedding application through periodic callbacks and sleep requests. For example, a flag can be set so that every sleep request in which the vm cannot swap in a coroutine ready to run is passed to the calling application, that can decide do use the idle time as it thinks best. For instance, this allows spectacular quasi-parallel effects in the FXChat binding, where the sleep() function allows Xchat to proceed. This may seem a secondary aspect, but other engines are actually very closed on this; once you start a script or issue a callback, all that the application can do is to hope that it ends soon. With Falcon you can interrupt the target vm with simple requests that will be honoured as soon as possible, and eventually resume it from the point it was suspended and inspected.

Since 0.9 we have introduced even a personalized object model. Falcon instances need not be full blown Falcon objects; the application may provide its own mapping from data to items travelling through the Falcon vm. Compare this with the need of creating a dictionary at each new instance, and having to map each property to a function retrieving data from the host program or from the binded library.

Other classes which you can override are the module loader, which may provide Falcon modules from other type of sources, or from internal storage in embedding applications, and since 0.9 the URI providers. Modules and embedding applications can register their own URI providers, so that opening a module in the app: space would turn into a request to get a module from an internally provided resource, or opening a stream from a script from app: would make possible to communicate binary data via streams to other parts of the application.

Frankly, we did our best to make our engine the most versatile around. They say LUA is very versatile, as you can reprogram it as you wish. But then, that is true for any open source project.

Page break by AutoPager?. Page( 5 ). Goto Window Top Page Up Page Down Goto Window Bottom

How will the new multithreading design in version 0.9 innovate the scripting language panorama?

There are two good reasons why multithreading in scripting languages are delicate matters (that many didn't even want to face). The first is that multithreading can break things. In good multithreading (multithreading which is allowed to actually exploit parallel computational power of modern architectures without excessive overhead), there is no way to recover from an error in a thread. A failing thread is a failing application, and that is a bit problematic to be framed in the controlled execution concept behind scripting language virtual machines.

The second reason is, as LUA developers point out, that a language where a = 0 is not deterministic cannot be proficiently used in multithreading. Some scripting language make a = 0 be deterministic and visible across threads by locking every assignment instruction, and that is a performance killer under many aspects. It doesn't only deplete performance on the script itself, but in case of concurrent programming in an application, it may severely deplete the host application performance by forcing it to unneeded context switches.

We opted for a pure agent based threading model. Each thread runs a separate virtual machine, and communication across threads can happen only through specialised data structures. In this way, each virtual machine can run totally unhindered by global synchronisation. It is possible to share raw memory areas via the MemBuf? item type, or to send complete objects created upon separate elaboration via a interthread item queue.

The point is that, in our opinion, multithreading in scripting languages cannot be seen as multithreading in low level languages, where each operation can be mapped to activities in the underlying parallel CPUs. The idea of mutex/event -based parallel programming is to be rejected in super-high level languages as scripting languages, as there are too many basic operations involved in the simplest instruction. Since, in complex applications written even with low level languages, those primitives are used by low to create higher level communication mechanisms, our view is that multithreading in scripting languages should provide exactly those mechanisms, without trying to force the scripts to do what they cannot proficiently do, that is, low level synchronization primitives.

When I write a server, I find myself struggling to create complex synchronisation rules and structures through those primitives, avoiding to use them directly, and I don't see why we should bestow the same struggle on script users. The realm where primitive synchronisation is useful is not a realm where scripting languages should play a direct role it's where you would want to write a C module to be used from the scripting language anyhow.

In 0.9 we have introduced an inter-thread garbage collector that accounts for objects present in more virtual machines. This is already exploited via the sharing of MemBuf? instances, but we plan to extend this support to other kind of objects. For example, it is currently possible to send a copy of a local object to another thread via an item queue (the underlying data, possibly coming from a module or from an embedding application, can actually be shared; it's just the representation each vm has of the object that must be copied). This makes it a bit difficult to cooperate on creating complete objects across threads, and even if this works in term of agent-based threading, we're planning to use the new interthread GC system to be able to send deep items across threads. Since 0.9, it is already possible to create deep data externally (i.e. in the embedding application or in a module) and send it to a vm in a different thread.

The only problem left in doing it natively across two different vms is ensuring that the source vm won't be allowed to work on the object and on any of the data inside it while the target vm is working on it. Even if this may seem a limitation, it's exactly what the "object monitor" approach to multithreading dictates, and it is perfectly coherent with our view of higher level parallel abstraction. 0.9 version also introduces the mechanism of interthread broadcasts, with message oriented programming extended to interthread operations. We still have to work that out, completely, but that's the reason why we're numbering this release range 0.9 .

Finally, as the vm have native multithread constructs now, we may also drop the necessity to have different vms for different threads, as each thread may just operate on its own local context, while common operations on the vm (as loading new modules) can be easily protected. Still, we need to consider the possibility of multiple vms living in different threads, as this is a useful model for embedding applications.


how to safely pass references:

" http://bartoszmilewski.wordpress.com/2009/03/23/types-for-concurrency/

Can a good type system prevent concurrency errors? Or is this a quest for the Holy Grail?

There are two parts to this question, corresponding to two major types of concurrency errors:

   1. Preventing data races
   2. Preventing deadlocks

I ll start with the first one.

Data races occur only when memory is shared between threads. Disallow sharing and data races are gone! In fact there is a name for threads that don t share memory: processes. It s perfectly feasible to have a concurrent language that disallows sharing Erlang is one (see my post about Erlang). The trick is to always pass data between threads by value. This is especially easy in functional languages.

Non-functional languages like C++, Java, or D (I was told by Walter Bright, the creator of D, to always use the full name, the D programming language, so that search engines can index it properly) tend to share data almost by default (see, however, this post).

In Java, all non-primitive types have reference semantics. When you pass a Java object between threads, you re only passing a handle; the object behind it becomes accessible from both threads.

C++ at least has means to express passing by value and move semantics for user-defined types. Still, it s up to the programmer to use them. Who ordered Guava?

For every type-system idea there is one or more dialects of Java that implement it. I ll start with an older attempt at data-race free Java called Guava, as it illustrates some of the basic premises. -Explicit sharing

The most important step if we don t want to completely ban the sharing of data is to regulate it. Let the programmer explicitly mark the data destined for sharing as such. The corollary is that the data that is not marked for sharing cannot be shared. This can be accomplished, for instance, by making all objects thread-local by default, or by using type modifiers that prevent references to such objects from escaping.

In Guava, the data type designed for sharing is called a Monitor. As the name suggests, all access to a Monitor is automatically synchronized by the Monitor s lock. This, incidentally, eliminates the need for the synchronized keyword, which is absent from Guava.

The non-shared data types are either Objects or Values.

Objects are just like regular Java Objects, except that they don t have a built-in lock, since they can never be shared between threads.

Values are either primitive values, like integers, or user-defined types with value semantics.

Monitors, Objects, and Values are collectively called instances. -Value semantics

When you pass a Value, the compiler will make a deep copy of it (well, sort of the monitors embedded in a Value are not deep copied). Since deep copying might be expensive, Guava defines operator move , which nulls the source. The syntax is:

  v2 = v1--;

The value v1 becomes null after the move to v2. This is similar to C++ unique_ptr and std::move. -Ownership

The biggest problem in lock based concurrency is to make sure that the correct lock(s) are taken when accessing shared data. In Guava, all Monitor s data are protected by that Monitor s lock. As long as they stay inside that Monitor, nothing bad can happen to them from the point of concurrency.

Values stored inside a Monitor are never accessible outside of the Monitor only their copies may be passed out.

The same is not true about Objects. Since Objects have reference semantics, there is a real danger that Objects references might escape the Monitor that protects them. Imagine a situation where two Monitors have references to the same Object. It is possible then that two threads may operate on that Object at the same time one entering through one Monitor, the other through the other Monitor. We have a data race!

Therefore it is important that every Object have an owner at all times. The Object s owner is either a Value or a Monitor. (The exception is a fresh Object that s been just allocated it has no owner until it is assigned to a variable.) Since an Object may only be owned by at most one Monitor, it is that Monitor that protects it from simultaneous (racy) access. -Regions

All Objects that are owned by a particular Monitor or Value form a region. Equivalently, assigning a monitored region to an object specifies what lock must be held when accessing it.

All instances may contain (references to) monitors, but monitors are not owned by anybody. References to the same monitor may appear in multiple regions and may be freely passed around. It is thus up to programmers to define an ordering scheme for their monitors in order to avoid deadlocks.

How can we protect Objects from moving between regions and acquiring multiple owners? We need a way to control aliasing.

Here are some Guava rules for passing Objects. A method may declare its Object parameter as either kept or lent. (By default, parameters to Object methods are kept and to Monitor methods are lent.) If the parameter is kept it must belong to the same region as this, and there are no limitations on its use. If, however, the parameter is lent, the method may not store a reference to it in this, nor may it store this inside a lent Object. No cross-aliasing is allowed.

A method may also be marked new if it returns a freshly created object, which has no owner yet. Constructors are considered new unless they accept kept parameters.

Notice that you may have multiple references to the same Object, but they will all be within the same region. The only instances that may be passed between threads are Monitors and Values. -Constructors

Guava final fields may either be initialized inside a constructor or in a private method that is only called from inside a constructor. (BTW, in D a private method is not private within a module, so the compiler would have to analyze the whole module to establish the latter condition.) [Note to self: The same scheme might be useful in the construction of immutable objects in D.]

Partially constructed Objects must not be visible outside constructors. The compiler must verify that constructors don t pass this to other methods, and don t store this inside other instances (no alias cross-contamination). -Optimizations

Copying Values around may be expensive. I already mentioned one optimization, the use of the move operator. The other optimization is related to immutability. If a Value is immutable, it may be safely passed by reference. Guava defines immutable classes as ones that have no update methods. Any method that may modify this must be marked update. The update notation is also extended to method parameters by default parameters are immutable.

There is a bonus advantage to separating update methods from the rest. In a Monitor, a non-update method may safely use a readers lock, which allows multiple readers to access data simultaneously, to increase concurrency. -Global and local methods

A method is considered local if its effects cannot be observed or influenced by other threads. All Object and Value methods are by default local. A local method is immune to races thus allowing single-thread optimizations.

Conversely, all Monitor methods are considered global, since operations on Monitors may be visible or influenced by other threads.

These defaults may be overridden. For instance, an Object may contain a reference to a Monitor. The methods that call this Monitor s methods must be marked global. Moreover, Object or Value methods that access Monitors that are passed to them as arguments must also be marked global. So touching a Monitor (which is equivalent to using its lock) taints the whole callers tree with the global annotation.

This is similar to the way update taints the callers tree, except the update annotation of a method only pertains to this, not to its arguments. However, when global and update are combined, they have the same tainting power as global. In particular, a method that invokes a global update method on its argument becomes tainted with global update.

Methods that are global update cannot be invoked from non-update methods, even if they only global-update their arguments.

Note: This part of the paper is not very clear to me. The authors never explain the importance of global update methods (other than optimization opportunities). -Limitations

Guava implements a relatively simple and somewhat limited system to eliminate races. It punts the problem of ownership-passing and lock-free programming. Even with those limitations the Guava type system is not simple.

The idea that safe multithreading may be achieved with a few simple modification to the type system seems to be untenable. However, as long as special type annotations are limited to the code that does actual multithreading, I think they re worth the added complexity.

"

more at here's more on his idea: http://bartoszmilewski.wordpress.com/2009/05/26/race-free-multithreading/ :

"

Teaser #1

In one of my previous posts I described a concurrent data structure based on Haskell s MVar. It s essentially a one-element message queue. Let s have a peek at the implementation that uses my proposed multithreaded scheme. Notice the total absence of special annotations even the synchronized keyword is missing. The only unusual thing is the move operator, :=, introduced in my previous blog. It is needed in case we want to pass unique objects through MVar. You are probably asking yourself, How the heck is this supposed to work in a multithreaded environment? Read on!

class MVar<T> { private: T _msg; bool _full; public: put: asynchronous (non-blocking) Precondition: MVar must be empty void put(T msg) { assert (!_full); _msg := msg; move _full = true; notify(); } take: If empty, blocks until full. Removes the message and switches state to empty T take() { while (!_full) wait(); _full = false; return := _msg; } }

Let s instantiate this template as a monitor that is an object accessible from multiple threads. We do it by specifying the owner as self (the this owner is not explicitly listed as a template parameter, but it can be passed to the template during instantiation).

auto mVar = new MVar<owner::self, int>;

In a self-owned object all methods are automatically synchronized. The move operator acting on an int simply copies it.

That was easy! How about instantiating a self-owned MVar for passing unique objects? Piece of cake!

auto q = new MVar<owner::self, unique Foo>;

auto foo = new unique Foo; q.put(:= foo); move foo assert (foo is null);

another thread unique Foo f2 = q.get(); implicit move of rvalue

So far data races have been avoided by implicit synchronization in self-owned objects and by the use of move semantics for unique objects.

Of course you know how to break the MVar, right? Instantiate it with a regular, non-unique class objects and watch aliases spread through threads. Let s try it:

auto mVar = new MVar<owner::self, Foo>; auto myFoo = new Foo; mVar.put(myFoo); now let me race through my local alias! myFoo.method();

Surprise! This code will not compile because a self-owned object cannot have a thread-local member. Because of a richer type system, the compiler can easily detect such violations. I ll explain the details of the ownership scheme in my future posts for now let me assure you that, no matter how hard you try, you can t create a data race in this system (unless you bypass it with explicit casts). How to making concurrent programming safer?

I propose to do this by extending the type system. (You ve already seen some examples above.) I realize that there is strong resistance to extending a language s type system. The main problem is increased complexity. The programmers are reluctant to study and use complex annotation schemes. I will argue that, in the case of multithreading, the complexity argument is reversed. Shared-memory multithreading is hard! Figuring out how to avoid all the pitfalls of sharing is harder than learning a type system that prevents them.

If a programmer is not willing to learn or cannot understand the race-free type system, he or she should not be writing multithreaded programs. How to limit complexity?

To limit the complexity of the type system, I came up with several simple principles. The most important one is that multithreaded extensions should not require modifications to single-threaded programs. That means coming up with reasonable defaults for all new annotations.

If necessary, the compiler should be able to derive some of the defaults from program analysis. Because modularity is very important, such analysis should not creep across compilation unit boundaries. Whole-program analysis is out of the question. How to keep goals reasonable?

There are two major challenges in multithreaded programming:

   1. Avoiding races
   2. Preventing deadlocks

The first problem is approachable, whereas the second seems to be a pie in the sky. I will therefore concentrate on designing a race-free system that is based on existing academic research.

Such system would be incomplete without some support for lock-free programming. There are very good reasons for the language to enforce sequential consistency. For comparison, C++ committee, after a long struggle, decided to allow non-SC primitives (weak atomics). Java only enforces SC for volatile variables. Teaser #2

Speaking of lock-free programming, here s the implementation of the infamous Double-Checked Locking Pattern:

class Singleton<T> { private: lockfree T _value; public: T get() lockfree { if (_value is null) { synchronized(this) { if (_value is null) _value = new T; } return _value; } } }

A lockfree variable is guaranteed to be always atomically accessed in a sequentially consistent way. A lockfree method is not synchronized, but it can only operate on lockfree variables (or use a synchronized section). Even though the use of lockfree doesn t introduce low-level races, it may create high-level races when accessing more that one lockfree variable at a time. But we all know that lock free programming is only for desperadoes. Lessons learned from previous work

There are many commonalities in various approaches to race freedom.

Most of those ideas are expressible through type qualifiers. Higher-level models

Explicit synchronization is hard, no doubt about it. In my proposal, the type system makes it safe. There are however other models of concurrency that, although more restrictive, are often easier to use.

One such model is based on message passing. In my opinion, support for message passing should be library-based. The race-free type system will make it safe and flexible. It will, for instance, support passing messages not only by value but also by reference without sacrificing safety. In traditional type systems there is no way to express the requirement that a message passed by reference must either be a monitor itself (have it s own lock) or behave like a C++ unique_ptr (an object that leaves no aliases behind). This requirement should be expressible through the type system, allowing the compiler to check for its violations.

I ve been paying a lot of attention to software transactional memory (STM). I believe that STM could be built into the language again, with the support of the type system. It looks like the hardest problem with STM is how it should inter-operate with other types of multithreaded access (both locked and lock-free). The simplest solution is to enforce full isolation no STM object can ever be accessed outside of a transaction. It s not clear how practical such approach is, but it definitely simplifies things to the point that STM becomes orthogonal to other types of synchronization. And that means it will be possible to tackle it at a later time, after the race-free multithreading model is firmly established.

In the next installment, I will describe the ownership scheme that is the foundation of the race-free type system.

Please vote for this article on reddit. Bibliography

C. Flanagan, M. Abadi, Object Types against Races. The seminal paper introducing ownership-based type systems.

See also my previous posts on Guava and GRFJ that discuss race free dialects of Java. "

java "synchronized" methods: monitors java "volatile" variables: accesses to this variable are automatically controlled by a lock. since you can only hold the lock during one read or write, you can't get a deadlock while waiting for it

http://www.javamex.com/tutorials/synchronization_concurrency_1.shtml

idea: make everything in one line of code atomic (acquire all locks atomically at the beginning of the line, and then use them all, and then release them all at the end). avoids this problem: http://stackoverflow.com/questions/461896/what-is-the-most-frequent-concurrency-problem-youve-encountered-in-java/466334#466334 "It can be easy to think synchronized collections grant you more protection than they actually do, and forget to hold the lock between calls. If have seen this mistake a few times:

 List<String> l = Collections.synchronizedList(new ArrayList<String>());
 String[] s = l.toArray(new String[l.size()]);

For example, in the second line above, the toArray and size() methods are both thread safe in their own right, but the size() is evaluated separately from the toArray(), and the lock on the List is not held between these two calls. If you run this code with another thread concurrently removing items from the list, sooner or later you will end up with a new String[] returned which is larger than required to hold all the elements in the list, and has null values in the tail. It is easy to think that because the two method calls to the List occur in a single line of code this is somehow an atomic operation, but it is not. "

in the Java Memory Model (JMM), variables which are not volatile and not accessed in synchronized methods apparently aren't required to be updated except thread-locally, i.e. http://stackoverflow.com/questions/461896/what-is-the-most-frequent-concurrency-problem-youve-encountered-in-java/477729#477729 " vote up 6 vote down

Until I took a class with Brian Goetz I didn't realize that the non-synchronized getter of a private field mutated through a synchronized setter is never guaranteed to return the updated value. Only when a variable is protected by synchronized block on both reads AND writes will you get the guarantee of the latest value of the variable.

public class SomeClass?{ private Integer thing = 1;

    public synchronized void setThing(Integer thing)
        this.thing = thing;
    }
    /**
     * This may return 1 forever and ever no matter what is set
     * because the read is not synched
     */
    public Integer getThing(){
        return thing;  
    }}

"

why aren't variables "volatile" by default in java? inefficiency, i guess?

http://stackoverflow.com/questions/461896/what-is-the-most-frequent-concurrency-problem-youve-encountered-in-java

copied here b/c its so useful:

"

What is the most frequent concurrency problem you ve encountered in Java? vote up 32 vote down star 31

This is a poll of sorts about common concurrency problems in Java. An example might be the classic deadlock or race condition or perhaps EDT threading bugs in Swing. I'm interested both in a breadth of possible issues but also in what issues are most common. So, please leave one specific answer of a Java concurrency bug per comment and vote up if you see one you've encountered. Thanks! concurrency multithreading java bugs polls flag

edited Jan 20 at 16:39 Alex Miller

community wiki

2 revisions

47 Answers oldest newest votes 1 2 next vote up 29 vote down

The most common concurrency problem I've seen, is not realizing that a field written by one thread is not guaranteed to be seen by a different thread. A common application of this:

class MyThread? extends Thread { private boolean stop = false;

  public void run() {
    while(!stop) {
      doSomeWork();
    }
  }
  public void setStop() {
    this.stop = true;
  }}

As long as stop is not volatile or setStop is not synchronized this is not guaranteed to work. This mistake is especially devilish as in 99.999% it won't matter in practice as the reader thread will eventually see the change - but we don't know how soon he saw it. link

flag

answered Jan 20 at 19:09 Kutzi

community wiki

2 A great solution to this is to make the stop instance variable an AtomicBoolean?. It solves all the problems of the non-volatile, while shielding you from the JMM issues. Kirk Wylie Jan 20 at 23:59 4 It's worse than 'for several minutes' -- you might NEVER see it. Under the Memory Model, the JVM is allowed to optimize while(!stop) into while(true) and then you're hosed. This may only happen on some VMs, only in server mode, only when the JVM recompiles after x iterations of the loop, etc. Ouch! Cowan Feb 11 at 6:15 show 6 more comments vote up 24 vote down

My #1 most painful concurrency problem ever occurred when two different open source libraries did something like this:

private static final String LOCK = "LOCK"; use matching strings in two different libraries

public doSomestuff() { synchronized(LOCK) { this.work(); } }

At first glance, this looks like a pretty trivial synchronization example. However; because Strings are interned in Java, the literal string "LOCK" turns out to be the same instance of java.lang.String (even though they are declared completely disparately from each other.) The result is obviously bad. link

flag

answered Jan 20 at 22:44 Jared

community wiki

4 This is one of the reasons why I prefer private static final Object LOCK = new Object(); Andrzej Doyle Jan 21 at 9:59 2 I love it - oh, this is nasty :) Thorbjørn Ravn Andersen Jan 21 at 16:40 2 That's a good one for Java Puzzlers 2. Dov Wasserman Jan 29 at 18:30 1 Actually...it really makes me want the compiler to refuse to allow you to synchronize on a String. Given String interning, there is no case where that would be a "good thing(tm)". Jared Feb 4 at 15:30 1 @Jared: "until the string is interned" makes no sense. Strings don't magically "become" interned. String.intern() returns a different object, unless you already have the canonical instance of the specified String. Also, all literal strings and string-valued constant expressions are interned. Always. See the docs for String.intern() and §3.10.5 of the JLS. Laurence Gonsalves Aug 6 at 4:21 show 5 more comments vote up 23 vote down

A common problem is using classes like Calendar and SimpleDateFormat? from multiple threads (often by caching them in a static variable) without synchronization. These classes are not thread-safe so multi-threaded access will ultimately cause strange problems with inconsistent state. link

flag

answered Jan 20 at 16:17 Alex Miller

community wiki

vote up 19 vote down

One classic problem is changing the object you're synchronizing on while synchronizing on it:

synchronized(foo) { foo = ... }

Other concurrent threads are then synchronizing on a different object and this block does not provide the mutual exclusion you expect. link

flag

answered Jan 20 at 16:02 Alex Miller

community wiki

1 Ha...now that's a tortured description. "unlikely to have useful semantics" could better be described as "most likely broken". :) Alex Miller Jan 20 at 16:45 show 5 more comments vote up 15 vote down

Though probably not exactly what you are asking for, the most frequent concurrency-related problem I've encountered (probably because it comes up in normal single-threaded code) is a

java.util.ConcurrentModificationException?

caused by things like:

List<String> list = new ArrayList?<String>(Arrays.asList("a", "b", "c")); for (String string : list) { list.remove(string); }

link

flag

answered Jan 20 at 16:14 Fabian Steeg

community wiki

show 1 more comment vote up 13 vote down

Not properly synchronizing on objects returned by Collections.synchronizedXXX(), especially during iteration or multiple operations:

Map<String, String> map = Collections.synchronizedMap(new HashMap?<String, String>());

...

if(!map.containsKey("foo")) map.put("foo", "bar");

That's wrong. It should be:

synchronized(map) { if(!map.containsKey("foo")) map.put("foo", "bar"); }

Or with a ConcurrentMap? implementation:

map.putIfAbsent("foo", "bar");

link

flag

edited Jan 20 at 16:43 Alex Miller

community wiki

2 revisions, 2 users

show 1 more comment vote up 13 vote down

Double-Checked Locking. By and large.

The paradigm, which I started learning the problems of when I was working at BEA, is that people will check a singleton in the following way:

public Class MySingleton? { private static MySingleton? s_instance; public static MySingleton? getInstance() { if(s_instance == null) { synchronized(MySingleton?.class) { s_instance = new MySingleton?(); } } return s_instance; } }

This never works, because another thread might have gotten into the synchronized block and s_instance is no longer null. So the natural change is then to make it:

  public static MySingleton getInstance() {
    if(s_instance == null) {
      synchronized(MySingleton.class) {
        if(s_instance == null) s_instance = new MySingleton();
      }
    }
    return s_instance;
  }

That doesn't work either, because the Java Memory Model doesn't support it. You need to declare s_instance as volatile to make it work, and even then it only works on Java 5.

People that aren't familiar with the intricacies of the Java Memory Model mess this up all the time. link

flag

answered Jan 20 at 17:06 Kirk Wylie

community wiki

show 6 more comments vote up 11 vote down

Forgetting to wait() (or Condition.await()) in a loop, checking that the waiting condition is actually true. Without this, you run into bugs from spurious wait() wakeups. Canonical usage should be:

 synchronized (obj) {
     while (<condition does not hold>) {
         obj.wait();
     }
     // do stuff based on condition being true
 }

link

flag

answered Jan 20 at 16:25 Alex Miller

community wiki

vote up 11 vote down

Another common bug is poor exception handling. When a background thread throws an exception, if you don't handle it properly, you might not see the stack trace at all. Or perhaps your background task stops running and never starts again because you failed to handle the exception. link

flag

answered Jan 20 at 17:06 Eric Burke

community wiki

show 1 more comment vote up 7 vote down

The most common bug we see where I work is programmers perform long operations, like server calls, on the EDT, locking up the GUI for a few seconds and making the app unresponsive. link

flag

answered Jan 20 at 17:01 Eric Burke

community wiki

show 1 more comment vote up 6 vote down

My biggest problem has always been deadlocks, especially caused by listeners that are fired with a lock held. In these cases, it's really easy to get inverted locking between two threads. In my case, between a simulation running in one thread and a visualization of the simulation running in the UI thread.

EDIT: Moved second part to separate answer. link

flag

edited Jan 20 at 16:11 Dave Ray

community wiki

2 revisions

show 2 more comments vote up 6 vote down

Mutable classes in shared data structures

Thread1: Person p = new Person("John"); sharedMap.put("Key", p); assert(p.getName().equals("John"); sometimes passes, sometimes fails

Thread2: Person p = sharedMap.get("Key"); p.setName("Alfonso");

When this happens, the code is far more complex that this simplified example. Replicating, finding and fixing the bug is hard. Perhaps it could be avoided if we could mark certain classes as immutable and certain data structures as only holding immutable objects. link

flag

answered Jan 20 at 18:05 Steve McLeod?

community wiki

vote up 6 vote down

It can be easy to think synchronized collections grant you more protection than they actually do, and forget to hold the lock between calls. If have seen this mistake a few times:

 List<String> l = Collections.synchronizedList(new ArrayList<String>());
 String[] s = l.toArray(new String[l.size()]);

For example, in the second line above, the toArray and size() methods are both thread safe in their own right, but the size() is evaluated separately from the toArray(), and the lock on the List is not held between these two calls. If you run this code with another thread concurrently removing items from the list, sooner or later you will end up with a new String[] returned which is larger than required to hold all the elements in the list, and has null values in the tail. It is easy to think that because the two method calls to the List occur in a single line of code this is somehow an atomic operation, but it is not. link

flag

edited Jan 21 at 18:08 Nick

community wiki

2 revisions

show 1 more comment vote up 6 vote down

Until I took a class with Brian Goetz I didn't realize that the non-synchronized getter of a private field mutated through a synchronized setter is never guaranteed to return the updated value. Only when a variable is protected by synchronized block on both reads AND writes will you get the guarantee of the latest value of the variable.

public class SomeClass?{ private Integer thing = 1;

    public synchronized void setThing(Integer thing)
        this.thing = thing;
    }
    /**
     * This may return 1 forever and ever no matter what is set
     * because the read is not synched
     */
    public Integer getThing(){
        return thing;  
    }}

link

flag

answered Jan 25 at 14:07 John Russell

community wiki

show 2 more comments vote up 5 vote down

Multiple objects that are lock protected but are commonly accessed in succession. We've run into a couple of cases where the locks are obtained by different code in different orders, resulting in deadlock. link

flag

answered Jan 20 at 16:15 bcash

community wiki

vote up 5 vote down

Thinking you are writing single-threaded code, but using mutable statics (including singletons). Obviously they will be shared between threads. This happens surprisingly often. link

flag

answered Jan 20 at 16:45 Tom Hawtin - tackline

community wiki

show 1 more comment vote up 5 vote down

Another common 'concurrency' issue is to use synchronized code when it is not necessary at all. For example I still see programmers using StringBuffer? or even java.util.Vector (as method local variables). link

flag

answered Jan 20 at 19:27 Kutzi

community wiki

show 1 more comment vote up 4 vote down

The dumbest mistake I frequently make is forgetting to synchronize before calling notify() or wait() on an object. link

flag

answered Jan 20 at 16:12 Dave Ray

community wiki

show 2 more comments vote up 4 vote down

I encountered a concurrency problem with Servlets, when there are mutable fields which will be setted by each request. But there is only one servlet-instance for all request, so this worked perfectly in a single user environment but when more than one user requested the servlet unpredictable results occured.

public class MyServlet? implements Servlet{ private Object something;

    public void service(ServletRequest request, ServletResponse response)
        throws ServletException, IOException{
        this.something = request.getAttribute("something");
        doSomething();
    }
    private void doSomething(){
        this.something ...
    }}

link

flag

answered Jan 20 at 17:27 Ludwig Wensauer

community wiki

vote up 4 vote down

Using a local "new Object()" as mutex.

synchronized (new Object()) { System.out.println("sdfs"); }

This is useless. link

flag

answered Jan 20 at 17:38 Ludwig Wensauer

community wiki

1 This is probably useless, but the act of synchronizing at all does some interesting things... Certainly creating a new Object every time is a complete waste. TREE Jan 22 at 15:59 vote up 4 vote down

Arbitrary method calls should not be made from within synchronized blocks.

Dave Ray touched on this in his first answer, and in fact I also encountered a deadlock also having to do with invoking methods on listeners from within a synchronized method. I think the more general lesson is that method calls should not be made "into the wild" from within a synchronized block - you have no idea if the call will be long-running, result in deadlock, or whatever.

In this case, and usually in general, the solution was to reduce the scope of the synchronized block to just protect a critical private section of code.

Also, since we were now accessing the Collection of listeners outside of a synchronized block, we changed it to be a copy-on-write Collection. Or we could have simply made a defensive copy of the Collection. The point being, there are usually alternatives to safely access a Collection of unknown objects. link

flag

answered Jan 20 at 19:12 Scott Bale

community wiki

vote up 4 vote down

I believe in the future the main problem with Java will be the (lack of) visibility guarantees for constructors. For example, if you create the following class

class MyClass? { public int a = 1; }

and then just read the value MyClass?.a from another thread, MyClass?.a could be either 0 or 1, depending on the JavaVM?'s implementation and mood. Today the chances for 'a' being 1 are very high. But on future NUMA machines this may be different. Many people are not aware of this and believe that they don't need to care about multi-threading during the initialization phase. link

flag

answered Jan 20 at 19:24 Tim Jansen

community wiki

show 3 more comments vote up 4 vote down

Synchronizing on a string literal or constant defined by a string literal is (potentially) a problem as the string literal is interned and will be shared by anyone else in the JVM using the same string literal. I know this problem has come up in application servers and other "container" scenarios.

Example:

private static final String SOMETHING = "foo";

synchronized(SOMETHING) { }

In this case, anyone using the string "foo" to lock on is sharing the same lock. link

flag

answered Jan 20 at 20:55 Alex Miller

community wiki

show 2 more comments vote up 4 vote down

Not exactly a bug but, the worst sin is providing a library you intend other people to use, but not stating which classes/methods are thread-safe and which ones must only be called from a single thread etc.

More people should make use of the concurrency annotations (e.g. @ThreadSafe?, @GuardedBy? etc) described in Goetz's book. link

flag

answered Jan 21 at 11:22 Neil Bartlett

community wiki

vote up 4 vote down

The most recent Concurrency-related bug I ran into was an object that in its constructor created an ExecutorService?, but when the object was no longer referenced, it had never shutdown the ExecutorService?. Thus, over a period of weeks, thousands of threads leaked, eventually causing the system to crash. (Technically, it didn't crash, but it did stop functioning properly, while continuing to run.)

Technically, I suppose this isn't a concurrency problem, but it's a problem relating to use of the java.util.concurrency libraries. link

flag

answered Jan 22 at 5:25 Eddie

community wiki

vote up 3 vote down

Unbalanced synchronization, particularly against Maps seems to be a fairly common problem. Many people believe that synchronizing on puts to a Map (not a ConcurrentMap?, but say a HashMap?) and not synchronizing on gets is sufficient. This however can lead to an infinite loop during re-hash.

The same problem (partial synchronization) can occur anywhere you have shared state with reads and writes however. link

flag

edited Jan 20 at 16:22 Alex Miller

community wiki

2 revisions

vote up 3 vote down

Not realising the java.awt.EventQueue?.invokeAndWait acts as if it holds a lock (exclusive access to the Event Dispatch Thread, EDT). The great thing about deadlocks is that even if that happens rarely you can grab a stack trace with jstack or the like. I've seen this in a number of widely used programs (a fix to a problem I have only seen occur once in Netbeans should be included in the next release). link

flag

edited Jan 20 at 16:58 Tom Hawtin - tackline

community wiki

2 revisions

vote up 2 vote down

Use of a global object such as a static variable for locking.

This leads to very bad performance because of contention. link

flag

answered Jan 20 at 16:03 kohlerm

community wiki

show 2 more comments vote up 2 vote down

Honesly? Prior to the advent of java.util.concurrent, the most common problem I routinely ran into was what I call "thread-thrashing": Applications that use threads for concurrency, but spawn too many of them and end up thrashing. link

flag

answered Jan 20 at 16:11 Brian Clapper

community wiki

show 1 more comment vote up 2 vote down

Race conditions during an object's finalize/release/shutdown/destructor method and normal invocations.

From Java, I do a lot of integration with resources that need to be closed, such as COM objects or Flash players. Developers always forget to do this properly and end up having a thread call an object that has been shutdown. link

flag

answered Jan 20 at 16:31 hamletdarcy

community wiki

"

http://en.wikipedia.org/wiki/Concurrency_pattern

The active object design pattern decouples method execution from method invocation that reside in their own thread of control.[1] The goal is to introduce concurrency, by using asynchronous method invocation and a scheduler for handling requests.[2]

The pattern consists of six elements:[3]


The balking pattern is a software design pattern that only executes an action on an object when the object is in a particular state. For example, if an object reads ZIP files and a calling method invokes a get method on the object when the ZIP file is not open, the object would "balk" at the request. In the Java programming language, for example, an IllegalStateException? might be thrown under these circumstances.

There are some[who?] in this field that think this is more of an anti-pattern, than a design pattern.


http://en.wikipedia.org/wiki/Double_checked_locking_pattern

Double-checked locking From Wikipedia, the free encyclopedia (Redirected from Double checked locking pattern) Jump to: navigation, search

In software engineering, double-checked locking is a software design pattern also known as "double-checked locking optimization[1]". The pattern is designed to reduce the overhead of acquiring a lock by first testing the locking criterion (the "lock hint") in an unsafe manner; only if that succeeds does the actual lock proceed.

The pattern, when implemented in some language/hardware combinations, can be unsafe. It can therefore sometimes be considered to be an anti-pattern.

It is typically used to reduce locking overhead when implementing "lazy initialization" in a multi-threaded environment, especially as part of the Singleton pattern. Lazy initialization avoids initializing a value until the first time it is accessed. Contents [hide]

[edit] Usage in Java

Consider, for example, this code segment in the Java programming language as given by [3] (as well as all other Java code segments):

Single threaded version class Foo { private static Helper helper = null;

    private Helper(){
 
 
    }
 
    public static Helper getHelper() {
        if (helper == null)
            helper = new Helper();
        return helper;
    }
 
    // other functions and members...}

The problem is that this does not work when using multiple threads. A lock must be obtained in case two threads call getHelper() simultaneously. Otherwise, either they may both try to create the object at the same time, or one may wind up getting a reference to an incompletely initialized object. This is done by synchronizing, as is shown in the following example.

Correct but possibly expensive multithreaded version class Foo { private Helper helper = null; public synchronized Helper getHelper() { if (helper == null) helper = new Helper(); return helper; }

    // other functions and members...}

However, the first call to getHelper() will create the object and only the few threads trying to access it during that time need to be synchronized; after that all calls just get a reference to the member variable. Since synchronizing a method can decrease performance by a factor of 100 or higher[2], the overhead of acquiring and releasing a lock every time this method is called seems unnecessary: once the initialization has been completed, acquiring and releasing the locks would appear unnecessary. Many programmers have attempted to optimize this situation in the following manner:

   1. Check that the variable is initialized (without obtaining the lock). If it is initialized, return it immediately.
   2. Obtain the lock.
   3. Double-check whether the variable has already been initialized: if another thread acquired the lock first, it may have already done the initialization. If so, return the initialized variable.
   4. Otherwise, initialize and return the variable.

Broken multithreaded version "Double-Checked Locking" idiom class Foo { private Helper helper = null; public Helper getHelper() { if (helper == null) { synchronized(this) { if (helper == null) { helper = new Helper(); } } } return helper; }

    // other functions and members...}

Intuitively, this algorithm seems like an efficient solution to the problem. However, this technique has many subtle problems and should usually be avoided. For example, consider the following sequence of events:

   1. Thread A notices that the value is not initialized, so it obtains the lock and begins to initialize the value.
   2. Due to the semantics of some programming languages, the code generated by the compiler is allowed to update the shared variable to point to a partially constructed object before A has finished performing the initialization.
   3. Thread B notices that the shared variable has been initialized (or so it appears), and returns its value. Because thread B believes the value is already initialized, it does not acquire the lock. If the variable is used before A finishes initializing it, the program will likely crash.

One of the dangers of using double-checked locking in J2SE 1.4 (and earlier versions) is that it will often appear to work: it is not easy to distinguish between a correct implementation of the technique and one that has subtle problems. Depending on the compiler, the interleaving of threads by the scheduler and the nature of other concurrent system activity, failures resulting from an incorrect implementation of double-checked locking may only occur intermittently. Reproducing the failures can be difficult.

As of J2SE 5.0, this problem has been fixed. The volatile keyword now ensures that multiple threads handle the singleton instance correctly. This new idiom is described in [4]:

Works with acquire/release semantics for volatile Broken under Java 1.4 and earlier semantics for volatile class Foo { private volatile Helper helper = null; public Helper getHelper() { if (helper == null) { synchronized(this) { if (helper == null) helper = new Helper(); } } return helper; }

    // other functions and members...}

Many versions of the double-checked locking idiom that do not use explicit methods such as volatile or synchronization to communicate the construction of the object have been proposed, and all of them are wrong.[3][4] [edit] Usage in Microsoft Visual C++

Double-checked locking can be implemented in Visual C++ 2005 if the pointer to the resource is declared with the C++ keyword volatile. Visual C++ 2005 guarantees that volatile variables will behave as fences, as in J2SE 5.0, preventing both compiler and CPU arrangement of reads and writes[citation needed]. There is no such guarantee in previous versions of Visual C++. However, marking the pointer to the resource as volatile may harm performance elsewhere, if the pointer declaration is visible elsewhere in code, by forcing the compiler to treat it as a fence elsewhere, even when it is not necessary. "


http://en.wikipedia.org/wiki/Guarded_suspension " In concurrent programming, guarded suspension[1] is a software design pattern for managing operations that require both a lock to be acquired and a precondition to be satisfied before the operation can be executed. The guarded suspension pattern is typically applied to method calls in object-oriented programs, and involves suspending the method call, and the calling thread, until the precondition (acting as a guard) is satisfied. Contents [hide]

[edit] Usage

Because it is blocking, the guarded suspension pattern is generally only used when the developer knows that a method call will be suspended for a finite and reasonable period of time. If a method call is suspended for too long, then the overall program will slow down or stop, waiting for the precondition to be satisfied. If the developer knows that the method call suspension will be indefinite or for an unacceptably long period, then the balking pattern may be preferred. [edit] Implementation

In Java, the Object class provides the wait() and notify() methods to assist with guarded suspension. In the implementation below, originally found in Kuchana (2004), if there is no precondition satisfied for the method call to be successful, then the method will wait until it finally enters a valid state.

public class Example { synchronized void guardedMethod() { while (!preCondition()) { try { Continue to wait wait(); & } catch (InterruptedException? e) { & } } Actual task implementation } synchronized void alterObjectStateMethod() { Change the object state &.. Inform waiting threads notify(); } }

An example of an actual implementation would be a queue object with a get method that has a guard to detect when there are no items in the queue. Once the "put" method notifies the other methods (for example, a get() method), then the get() method can exit its guarded state and proceed with a call. Once the queue is empty, then the get() method will enter a guarded state once again. [edit] See also


"A read/write lock pattern or simply RWL is a software design pattern that allows concurrent read access to an object but requires exclusive access for write operations."

---

"In computer programming, the scheduler pattern is a software design pattern. It is a concurrency pattern used to explicitly control when threads may execute single-threaded code.

The scheduler pattern uses an object that explicitly sequences waiting threads. It provides a mechanism to implement a scheduling policy, but is independent of any specific scheduling policy the policy is encapsulated in its own class and is reusable.

The read/write lock pattern is usually implemented using the scheduler pattern to ensure fairness in scheduling.

"


"In computer programming, the thread pool pattern is where a number of threads are created to perform a number of tasks, which are usually organized in a queue. Typically, there are many more tasks than threads. As soon as a thread completes its task, it will request the next task from the queue until all tasks have been completed. The thread can then terminate, or sleep until there are new tasks available."


"Thread-local storage (TLS) is a computer programming method that uses static or global memory local to a thread.

This is sometimes needed because all threads in a process share the same address space. In other words, data in a static or global variable is normally always located at the same memory location, when referred to by threads from the same process. Variables on the stack however are local to threads, because each thread has its own stack, residing in a different memory location.

Sometimes it is desirable that two threads referring to the same static or global variable are actually referring to different memory locations, thereby making the variable thread local, a canonical example being the C error code variable errno.

"


"The reactor design pattern is a concurrent programming pattern for handling service requests delivered concurrently to a service handler by one or more inputs. The service handler then demultiplexes the incoming requests and dispatches them synchronously to the associated request handlers.

[edit] "

stm primitives: "atomic" blocks "retry" command

oop stm in middle/end of: http://s3.amazonaws.com/dconf2007/DSTM.ppt


" [reply] Re: (OT) Programming languages for multicore computers by BrowserUk? (Saint) on May 06, 2009 at 10:07 UTC

      The following was started 2 days ago, but I've been unable to finish it. It was suggested to me I post it as-is anyway, and if there is sufficient interest, I'll try to complete it later.
      My first response to almost every question you've asked, is: It depends.
      It mostly depends upon the requirements of the algorithm and/or data that needs processing. There are essentially 3 basic types of parallelism (with many variations):
         1. IO parallelism:
            The simplest of them all. Mostly IO-bound.
            Typified by webservers and the like. Each execution context is essentially independent of all others.
            Each execution context spends most of its time waiting (a device or communications partner) for something to do. And when it does get something to do, it does it relatively quickly and then goes back to waiting.
         2. Task parallelism:
            Potentially the most complex. A mixture of IO and cpu.
            Typified by the Google Map-Reduce architecture. (Though that is (currently) applied at the macro (box) level).
            Each execution context runs a different algorithm on independent data, but they need to coordinate and synchronise between each other. Here the algorithms (tasks) are overlapped by passing the stream of data from one stage to the next in smallish chunks. Think of a pipeline where one stream of data needs to be processed by several different algorithms, but serially--read; decode; transform; encode; write.
         3. Data parallelism:
            Potentially the biggest gains. CPU-bound.
            Typified by image (X-ray; MRI;etc.) processing.
            One large homogeneous dataset needs a one or two, often simple algorithms applied to the entire dataset. Small subset of the dataset are assigned to each execution context perform the same algorithm(s) on. 
         1. This type lends itself to event driven/state machine/select methods.
            The problems arise when the process also need to perform some cpu-intensive processing in addition to the IO-driven processing. This necessitates the programmer injecting artificial breaks into the cpu-intensive code in order to ensure timely servicing of the IO events.
            Whilst not onerous for a single single program with exclusive use of the hardware, it becomes necessary to re-tune the software for each cpu; OS; end even application mix; that ths code will run on. making for costly porting and on-going maintenance.
         2. As mentioned above, this type is currently being dealt with at the macro-level.
            Using clusters of commodity hardware and disk-based 'pipelines' for shared storage. Whilst reasonably effective for the last, current, and possibly next generations of 1, 2, and 4-core commodity boxes with 2 to 4 GB of ram, the cost of all the disk-IO between stages will start to make itself felt as we move to larger numbers of cores and ram per box.
            Once you have cores ranging from 8 to 32; multiplied by simultaneous threading per core of 2 to 8; and anything from 64 to 512GB per box, it makes less and less sense to use processes and disk-based storage to handle the pipelines.
            When the machine can store an entire partition of the dataset entirely in memory, it is far quicker to load the dataset into memory once, and have each stage of the pipeline operate on it in place. You could do this by running each stage over the dataset serially, but as with the current scheme, handing over smallish chunks from stage to stage allows you to overlap the stages and vastly reduce the end-to-end processing time. So, running all the stages as threads, bucket-chaining over a single, large shared dataset is a natural evolution of the processes and disks model that brings huge efficiency saving for negligible extra infrastructure costs.
            Just a single shared integer between each stage that is incremented by the previous stage and read-only to the following stage. Indicating how far the previous stage has made it through the dataset, and where the following stage continues it next cycle. Utilises only simple, two-party condition signalling with no possibility of dead-locks, priority inversions or any of those other scare-monger nasties.
            Huge performance gains secured from avoiding inter-stage pipeline IO costs.
         3. This type is already happening in games and Computer Generated Imagery.
            It simply doesn't make sense to partition these kinds of datasets (images etc.) across processes, let alone boxes. The shared-memory model and threading, is the only way to go. But again, the "big problems" of the shared memory model--deadlocking; priority inversion etc.--do not arise, because the each thread operates exclusively on its own partition of the overall dataset.
            By linearly or recursively partitioning the dataset, no locking is required and each thread is free to run full-speed on its part of the problem to completion. The only synchronisation involved is the master thread waiting for the workers to finish before it does whatever needs doing with the results.
            More and more, this kind of processing will be offloaded to specialist CPUs (dsps; gpus)(hereafter SPUs), for processing. However, with current setups this involves the transfer of data from cpu memory to SPUs memory and back again. And with potentially multiple processes needing the SPU's services, and with memory access already the bottleneck on performance, it will make more sense in the near future for Spus to do their thing by directly accessing the data in the main memory. Effectively making the SPUs cores extensions of the main cpu. We're already seeing this with SIMD and similar instruction sets. 
      The future is threaded. We are just waiting for the software to catch up with the hardware. And I fear we are going to have to wait for the next generation of programmers before that problem will be properly fixed.
      Just as many of my generation have problems with using SMS--and with the language forms it generates--whilst it is simpler than tying shoelaces for the new generations. So it is with threading. Many in my generation only see the problems involved--not the potential.
      Just as it tool the ubiquity of the web to drive the transition from th request-response model to the upcoming Web 2 era. So, it will require the ubiquity of threads and cores to drive the transition from forks&pipes to threads&shared state.It may take until the retirement of the Luddite generation for it to happen. But it will happen.
      As you've rightly indicated, one of the primary drivers of the transition will be the evolution of computer languages that give easy--and well abstracted--access to the potential. Whilst many of those you cite are a adept at one or more of the types of concurrency, none of them are truly adept at all of them. And problems arise because real-world problems tend to require two or more of those types in the same application.
      A secondary problem with most existing language abstractions of concurrency is that they tend to take one of two tacts:
          o A low-level approach--typified by C/C++ and libraries like Intel Threading Building Blocks--whereby the programmer is given access to, and required to exercise, full control over all aspects of the threading.
            This not just enables, but forces the programmer to deal with all the issues of sharing state, not just for that data that needs to be shared, but all state! And that places the onus upon the programmer to ensure the segregation of per stage (per thread) state. A time consuming and rigorous task much better suited to automation by the compiler.
          o A high-level, encapsulated approach--typified by most of the concurrent FP languages like Haskell and Erlang--which removes most of the control from the programmers hands.
            The problem here is that the FP (and even const correct procedural) languages prevent the programmer (at least without resorting to extraordinary procedures (with often "dangerous" sounding names)), from sharing state for those data that need to be shared, with the result that the data has to be needlessly and expensively copied, often many times, as it transitions through multi-stage pipelined processes; or as multiple workers concurrently do the same processing on their own small chunks of the total dataset, and then they are 're-assembled' back to a whole.
            This compiler enforced (and needless) replication of state becomes a huge drain upon system resources via memory thrash, IO and communications protocols overheads. This can turn O(n) algorithms in to O(n^2) algorithms (or worse), once those overheads are factored into the equations. That detracts from, and can even completely negate, the benefits of concurrency. Add in the extra costs of concurrent development, maintenance and testing, and it is easy to see why people see little to be gained from the concurrency. 
      The solution is relatively simple, in concept at least. There needs to be a clear delineation between that state that needs to be shared--for efficient concurrency. And that state that mustn't be shared--for algorithmic safety. And the programmer must have explicit control over the former, whilst the compiler ensures and enforces the latter. In this way, thread procedures can be written without regard to the safety of local variables and state. Anything declared 'local', or 'non-shared', or better, any not explicitly marked as shared, is protected through compiler-enforced mechanisms from leaking between threads. At the same time, shared state can be passed around between threads as the needs of the algorithm dictate, without incurring huge penalties of copying involved in write-once variables.
      Another way of viewing the two concurrency abstractions is:
          o the former seeks to give the programmer a zillion tools for controlling access to shared state and synchronising control flows between threads.
            The problem here, beyond the well-known difficulties of getting this stuff right, is that the exposure and predominance of these mechanisms actively encourages programmers--especially those new to concurrency--to design their algorithms around utilising locks and synchronisation.
          o whilst the latter seeks to hide all such control within the abstraction beyond the programmers reach.
            With this approach, you end up with either weird and unintuitive programming constructs and paradigms; or huge, unwieldy and slow compile-time analysis engines; or belt&braces, copy-everything and lock-everything always--"whether it needs it or not"--infra-structure and library code. 
      There is a third answer to the problems of locking and synchronisation. Avoid them! Sounds too simple to be a point worth making, but it is a fact that most algorithms and applications that can benefit from concurrency can, with a little care, be architected such that they use little or no locking or synchronisation, despite that they may manipulate large volumes of shared state, in place. The trick is to program the application so that only one thread attempts to access any given piece of data an any given time. I'm not talking about using mutexs to prevent concurrent access. More, mechanisms that don't allow the programmer to try. But without throwing the (procedural) baby out with the bath water.
      This can be done today--in any language. It just requires the programmer to be aware of the techniques and apply them. But it would be a whole lot easier if programmers didn't have to become aware of the implementation details or re-invent them every time. To that end, there are several new abstractions that can help:
         1. Memory delegates- handles to (subsections of) real shared memory, that can be passed between threads, but which can only be used by the thread holding the delegate. Old copies of the delegate become disabled. Runtime enforcement is fatal. Compile-time detection easy.
         2. Thread-controlled shared variables:
            This variable is shared--by threads A & B (& F ...) ONLY!
         3. Access type controls on shared variables:
            This variable is shared between threads A & B, but only thread A can write to it and only thread B can read from it. And B cannot read from it until A has written it. And thread A cannot write to it again until thread B had read it.
         4. Bi-directional latching shared variables.
            Variable shared between threads A & B, readable and writable by both; but cannot read until A has written; and once B has read, neither can read (and A cannot write again) until B has written; now only A can read and neither can read again (and B cannot write again), until A has written.
            And so you encapsulate the request-response communications protocols in a single variable.
         5. Self limiting queues:
            Like thread controlled variables, which threads have access (and what type of access) is controlled, and both compile-time and run-time enforced. But in addition, the queues limit their own sizes such that any attempt to write when the queue is 'full' causes the writer to block. Any attempt to read when the queue is empty causes the reader to block.
            You get self-regulating synchronisation with tunable buffering.
         6. Declarative threading of functions for 'fire and forget' usage.
         7. Delayed (lazy) results from threads (promises).
            Instead of having explicitly create a thread passing the thread procedure address; retrieve the handle; and then later, explicitly call a blocking join to get the results. Instead, the above two combine to give something like:
            sub threadProc1 :threaded { ... } sub threadProc2 :threaded { ... } my $result1 = threadProc1( @args ); my $result2 = threadProc2( @otherArgs ); unless( defined $result1 && defined $result2 ) { ## continue doing other stuff while we wait } ## If the threads have completed by the time we reach this statement ## then the statements completes immediately. Otherwise it blocks unti +l ## both results are available. Effectively joining both threads. my $composite = $result1 + $result2;
            [download]
            /li>
      With a few simple control and data sharing constructs such as these, all the scary, difficult parts of threading disappear under the covers, and the programmer can get back to concentrating uponn their application whilst knowing that it will benefit from whatever core and threads are available.
      You can imagine the pipeline example (read; decode; transform; encode; write), from above being written something along the lines of;
      my $pipeline :promise = TQmap { open $out, '>', $outFile; write( $out, $_ ) while $Qin->dequeue; close $out; } 1, 10, TQmap { while( $Qin->dequeue ) { my $encoded = encode( $_ ); $Qout->enqueue( $encoded ); } } 2, 20, TQmap { while( $Qin->dequeue ) { my $transformed = transform( $_ ); $Qout->enqueue( $transformed ); } }, 4, 40, TQmap { while( $Qin->dequeue ) { my $decoded = decode( $_ ); $Qout->enqueue( $decoded ); } }, 2, 20, TQmap { ## read open my $in, '<'. $inFile; $Qout->enqueue( $_ ) while <$in>; close $in; }, 1, 10, undef; monitorKeyboard() until defined $pipeline; if( $pipeline != SUCCESS ) { log( error, $pipeline ); } else { log( success, $pipeline ); } exit;
      [download]
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
      "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."" --- http://perlmonks.org/?node_id=762207

http://www.computerworld.com.au/article/307418/-z_programming_languages_erlang :

" Some things we added to the language were with hindsight not so brilliant. I'd happily remove macros, include files, and the way we handle records. I'd also add mechanism to allow the language itself to evolve.

We have mechanisms that allow the application software to evolve, but not the language and libraries itself. We need mechanisms for revision control as part of the language itself. But I don't know how to do this. I've been thinking about this for a long time.

Instead of having external revision control systems like Git or Subversion I'd like to see revision control and re-factoring built into the language itself with fine-grain mechanism for introspection and version control. "

ampl language

scala vs java: http://www.computerworld.com.au/article/315254/-z_programming_languages_scala "The challenge was to identify constructs from one side with constructs from the other. For instance, a function value in a functional programming language corresponds to an object in an object-oriented language. Basically you could say that it is an object with an "apply" method. Consequently, we can model function values as objects.

Another example is in the algebraic data types of functional languages that can be modelled as class hierarchies on the object-oriented side. Also, the static fields and methods as found in Java. We eliminated these and modelled them with members of singleton objects instead. There are many other cases like these, where we tried to eliminate a language construct by matching and unifying it with something else. "

"What is your favourite feature of the language, if you had to pick? I don't think I can name a single favourite feature. I'd rather pick the way Scala's features play together. For instance, how higher-order functions blend with objects and abstract types, or how actors were made possible because functions in Scala can be subclassed. The most interesting design patterns in Scala come precisely from the interaction between object-oriented and functional programming ideas. "

wikipedia's eiffel page explains features analogous to haskell's type parameters (generics) and contexts (constraints that certain type parameters be in certain type classes) in a simple way:

"Names can, of course, be reused in different classes. For example the "+" operator is defined in several classes: INTEGER, REAL, STRING, etc. [edit] Genericity

Classes can be generic, to express that they are parameterized by types. Generic parameters appear in square brackets:

class LIST [G] ...

G is known as a "formal generic parameter". (Eiffel reserves "argument" for routines, and uses "parameter" only for generic classes.) With such a declaration G represents within the class an arbitrary type; so a function can return a value of type G, and a routine can take an argument of that type:

item: G do ... end put (x: G) do ... end

The LIST [INTEGER] and LIST [WORD] are "generic derivations" of this class. Permitted combinations (with n: INTEGER, w: WORD, il: LIST [INTEGER], wl: LIST [WORD]) are

n := il.item wl.put (w)

INTEGER and WORD are the "actual generic parameters" in these generic derivations.

It is also possible to have 'constrained' formal parameters, for which the actual parameter must inherit from a given class, the "constraint". For example in

   class HASH_TABLE [G, KEY -> HASHABLE]

a derivation HASH_TABLE [INTEGER, STRING] is valid only if STRING inherits from HASHABLE (as it indeed does in typical Eiffel libraries). Within the class, having KEY constrained by HASHABLE means that for x: KEY it is possible to apply to x all the features of HASHABLE, as in x.hash_code. "

"infecting nondeterminism" like brace expansion in shells but with infection; in shell, a{d,c,b}e -> ade ace abe;

;; for separator

make sure you can do this concisely: http://en.wikipedia.org/wiki/Delegation_pattern

toread: http://www.springerlink.com/content/hwua58hp0164kbep/ "A Marriage of Class- and Object-Based Inheritance Without Unwanted Children " http://www.springerlink.com/content/vj48mc681h0ctvel/ "Intersecting Classes and Prototypes"

"late binding of self": in OOP, a primary difference between true delegation and mere consultation is that, if the delegee code refers to "self", that refers to the delegator. see figure 1 in "Intersecting Classes and Prototypes"

In delegation, when the delegator object is called, the delegator calls the delegee code. Late binding of self means that if the delegee code refers to "self", then that refers to the delegator object, not to the delegatee object.

mb UML is useful for graph data patterns: http://en.wikipedia.org/wiki/File:Aggregation-Composition3.png

"graph-oriented programming" http://www.springerlink.com/content/v16l843781g54283/ GOP: A Graph-Oriented Programming Model for Parallel and Distributed Systems http://docs.jboss.org/jbpm/v3/userguide/graphorientedprogramming.html

call data interfaces "views"? is this similar to "roles" http://en.wikipedia.org/wiki/Role-Oriented_Programming ?

roles: read Kasper B. Graversen's thesis: http://www.itu.dk/people/kbilsted/graversen06thesis.pdf

there's values and references. and, there's literals and variables. these are analogous entities, but the former two are types of values that can go into variables, whereas the latter two are types of syntactical entites that can go into code (values that can go into ASTs). also, when a variable is encountered in code, the default is to replace it with its value (to override this, u must "quote"), whereas when a reference is encountered in a variable, the default can be that (like in Python), or it can be to leave it as a reference (like in C). The Python method seems clearer b/c you don't have to deal with the distinction b/t vals and refs inside variables. meta project connection?

inversion of control: when the application is called as something like an "event handler" for a framework (rather than the "typical" situation, where the application is in charge of the flow of control)

dependency injection: in OOP: ok, so you have an object with a field that contains a reference to another object. sometimes the first object calls a method on the second object. ok, great, so the implementation of the two objects is decoupled. But at some point, you're going to have to instantiate the first object and the second object, and pass the first object a reference to the second one. this is a special case of inversion of control because the first object is getting called by someone else.

some ways to do this:

service locator: an object that you can query to find other services

i guess a virtue of setter or interface injection is that the framework can swap out services at mid-runtime by calling the component again (if this is allowed..). similarly, a virtue of service locator is that the component can call the framework again later. Avalon's approach (interface injection of a service locator) gives both of these. otoh the service locator should also be provided in the constructor so there isn't two setup phases.

Fowler's article: http://www.martinfowler.com/articles/injection.html notes: in his PicoContainer? constructor injection example, in the configureContainer routine, you didn't have to explicitly register the link from MovieLister? to MovieFinder?; PicoContainer? must have introspected and saw that MovieLister?'s constructor takes a param of type MovieFinder?. the choice of which MovieFinder? implementation to use for MovieFinders? was explicitly made; so PicoContainer? must have used that as the arg for MovieLister?'s constructor. By contrast, in his Spring setter injection example, the fact that MovieLister? needed a MovieFinder? did have to be explicitly declared. Clearly, I prefer to implicit introsective way.

  J's dynamic component system/implementation choosing system will have to overcome issues like this: "If you have multiple ways to construct a valid object, it can be hard to show this through constructors, since constructors can only vary on the number and type of parameters. This is when Factory Methods come into play, these can use a combination of private constructors and setters to implement their work. The problem with classic Factory Methods for components assembly is that they are usually seen as static methods, and you can't have those on interfaces. You can make a factory class, but then that just becomes another service instance. A factory service is often a good tactic, but you still have to instantiate the factory using one of the techniques here."

Fowler opines that "Of course the testing problem is exacerbated by component environments that are very intrusive, such as Java's EJB framework. My view is that these kinds of frameworks should minimize their impact upon application code, and particularly should not do things that slow down the edit-execute cycle. "

aggregation: like consultation, but the object being consulted is just some other object, which persists before and/or after the creation/destruction of the consulting object (i.e. it's not just some private object owned by the consulting object and destroyed when the consulting object is destroyed)

how does Haskell do component programming without typecasts? see: Haskell COM interface at http://www.haskell.org/hdirect/design-7.html. But that's an FFI.

toread: http://portal.acm.org/citation.cfm?id=1294920 A type-level approach to component prototyping. haskell, and also has a useful-looking basis set of functionality for component programming.

http://www.md.chalmers.se/Cs/Research/Functional/Fudgets/Intro/ Haskell GUI approach

A computational model of classical linear logic: Is classical linear logic inherently parallel? (1997)

Release 0.6 of Monadic Constraint Programming We've just released version 0.6 of the Monadic Constraint Programming framework on Hackage.

This release provides a whole lot of generic support for Finite Domain (FD) constraint solvers: a common modeling language and infrastructure for targeting different backends. Very useful if you happen to develop an FD solver and want to hook it up to our framework to benefit from its advanced search capabilities.

Users will of course be much more interested in the actual backends that we provide. Besides the basic Haskell FD solver we had before, there are now three different ways of interfacing Gecode, one of the best FD solvers out there and open source too. To get started, the examples directory shows how to model a number of well-known problems. Posted by Tom Schrijvers at 3:59 PM 0 comments Links to this post Labels: constraints, Gecode, Haskell

Monday, September 7, 2009 EffectiveAdvice?: AOP, mixin inheritance, monads, parametricity, non-interference, ... How to reason about effectful advice? Write your AOP programs in Haskell, the world's best imperative programming language. Use monads and monad transformers for effects and functional mixins for advice. In return you get powerful reasoning tools: equational reasoning and parametricity.

Read the technical report. The appendix has some pretty cool parametricity proofs on non-interference of advice, based on Janis' Voigtlaender's ICFP'09 paper "Free Theorems Involving Type Constructor Classes".

    EffectiveAdvice: Overview, background and proofs
    Bruno Oliveira, Tom Schrijvers and William Cook
    Abstract
    Advice is a mechanism, widely used in aspect-oriented languages, that allows one program component to augment or modify the behavior of other components. Advice is useful for modularizing concerns, including logging, error handling, and some optimizations, that would otherwise require code to be scattered throughout a system. When advice and other components are composed together they become tightly coupled, sharing both control and data flows. However this creates important problems: modular reasoning about a component becomes very difficult; and two tightly coupled components may interfere with the control and data flows of each other.
    This paper presents EffectiveAdvice, a disciplined model of advice, inspired by Aldrich's Open Modules, that has full support for effects in both base components and advice. With EffectiveAdvice, equivalence of advice, as well as base components, can be checked by equational reasoning. The paper describes an implementation of EffectiveAdvice as a Haskell library and shows how to use it to solve well-known programming problems. Advice is modeled by mixin inheritance and effects are modeled by monads. Interference patterns previously identified in the literature are expressed as combinators. Parametricity, together with the combinators, is used to prove two harmless advice theorems. The result is an effective model of advice that supports effects in both advice and base components, and allows these effects to be separated with strong non-interference guarantees, or merged as needed.

Posted by Tom Schrijvers at 9:16 PM 0 comments Links to this post Labels: AOP, Haskell

http://tomschrijvers.blogspot.com/2009/05/dictionaries-eager-or-lazy-type-class.html

http://tomschrijvers.blogspot.com/2009_01_01_archive.html

http://en.wikipedia.org/wiki/Hollywood_Principle

http://en.wikipedia.org/wiki/Object_composition

mock object testing: http://martinfowler.com/articles/mocksArentStubs.html create a fake version of some object, pass it to the rest of the program, and then put code in it to verify that it is being called correctly (method calls in the right order, with the right arguments). this provides "behavior verification", i.e. a test that the program is interacting with the mocked obj in the proper way, rather than just "state verification", i.e. a test that that object ends up in the proper state. for jasper: how can a mock obj be constructed in a statically typed language? perhaps with multi-stage metaprogramming?

note: even though we only have a few types of matched delimiters on the keyboard ( (), {}, [], <> ), you can combine ordinary parens with "modifiers" to make as many semantic types of matched delimiters as you want (i.e. m() , m()m, &( )&, etc). so it's just a matter of what's easier to read and what's easier to type.

should something approximating the capabilities of static methods and static fields in Java be in Jasper? yes, this is a global, but how to do dynamic service locators otherwise?

j: thread-local variables?

j: erlang's reliability, hot-code swapping (should factor in with making the implementation choice dynamic, not just at compile time -- swapping in a new implementation should only break things using those (parts?) of its interfaces which have been removed or whose behavior becomes incompatible)

thinking about how to do a dynamic service locator w/o typecasting. this won't work but: one could have a hetero array that stores tuples (value, type), where the type has to be the type of that value. now you can look at the type of the tuple and that tells you what to cast it to. but this doesn't help b/c the client already knows what it wants; and throwing an error if the 'type' member of the tuple is wrong isn't different from just asking the value what type it is via introspection, and throwing an error if its wrong (or, otherwise, an error will be thrown at cast time if the cast is wrong).

"inheritance anomaly": http://citeseer.ist.psu.edu/old/matsuoka93analysis.html Analysis of Inheritance Anomaly in Object-Oriented Concurrent Programming Languages (1993) a list of a bunch of situations in which synchronization code cannot be inherited


http://en.wikipedia.org/wiki/Category:Concurrent_programming_languages

http://en.wikipedia.org/wiki/Join-calculus_%28programming_language%29

www.informatics.sussex.ac.uk/users/vs/research/paps/anomalySurvey.pdf The inheritance anomaly: ten years after. notes: ok paper. contains survey of some recent sync techniques, and notes about the types of situations in which synchronization code prevents things from being inherited in these languages. conclusion is that by separating sync code from other code, only the sync code needs to be changed. references AOP. implicitly suggests that a programming language should include a number of baked-in "aspects", mutex specs being one of these. what should the others be? logging, per-user/group access control, error handling, what else?


more concurrency models

petri nets

meta-models: bisimulation (equivalence relation on automata)

process logicy primitives: a

b -- can be exec simult, or in any order, a+b -- opponent can choose which to execute, only one is executed, ab -- first a, then b executed in sequence, a CONC b -- type of concurrency is input-dependent. what else? some ideas: a SYNC b -- lower-level sychronization guarantees (todo), a ++ b -- opponent chooses a+b or ab, a ? b -- opponent chooses a ++ b or to do neither of them

vaughan pratt's "Transition and cancellation in concurrency and branching time" boole.stanford.edu/pub/seqconc.pdf

ideas to use graphs in modelling concurrency

partial ordering (dependencies) of operations

graph of computers

graph of processes on computers

temporal unrolling of the previous 2

how to represent relational data in graphs? look at relational bayes nets?

doug's suggestion: decompose as parallel, not concurrent (no inter-process communication except between parent and child at spawn and result time)

arrow's thm for distributed systems? either you have to give up composability, or irreversibility, or what else? lock-free? but don't we have a thm that wait-free is always possible, given enuf memory? mb give up not [memory that grows linearly w/ # of processes]?

wait free, lock free

eiffel's SCOOP extension:

http://scoop.origo.ethz.ch/wiki/Tutorial

" ... All calls to operations on a particular object are handled by a single processor; we say that the processor handles the object. With these basic concepts, we may isolate the difference between sequential and concurrent computation down to a single key point: what happens in the basic operation of O-O computation, a "feature call" of the form

x.f (a)

In a sequential context this is synchronous: computation doesn't proceed until the call to f has been completed. In a concurrent context, if x denotes an object handled by a different processor (different from the processor handling the object on whose behalf the call is being executed), the communication can be asynchronous: computation can just proceed without waiting for f to terminate. That's indeed the whole idea of concurrency: several computations can proceed in parallel, not waiting for each other until they need to. When they indeed "need to" is, in SCOOP, determined not by the programmer but automatically by the SCOOP mechanisms: the processor of the client object will need to resynchronize with the processor in charge of x when its computation requires access to a query on x. This SCOOP policy is called wait by necessity.

To distinguish between synchronous and asynchronous calls, the program must specify whether the processor handling x is the same or another. This leads to the single language extension required by SCOOP: separate declarations. If x represents an object handled by another processor, it will be declared (in Eiffel syntax) not as

x: SOME_TYPE

but as

x: separate SOME_TYPE

This doesn't specify the processor but does specify that it is (or may be) a different one, yielding a different semantics of calls.

For simple reasons of being able to reason about programs, calls on a separate object are exclusive: only one client can use a separate supplier at a time. The mechanism for reserving an object is simply argument passing: a call of the form

     g (x)

or

     b.g (x)

where x is separate, will only proceed when the object attached to x becomes available to the caller; it will then retain that object for the duration of the call. Calls of the above basic form x.f (a), where x is separate, are only permitted when x is such a formal argument of the enclosing routine, here g. This rule guarantees predictability of the code and avoids major mistakes; even for an experienced concurrent programmer, it is very easy - in a context where the rule would not apply - to believe instinctively that in

x.insert (a, some_position) ... y = x.value (some_position)

the element retrieved by the last instruction is the one inserted by the first instruction. But some other separate client may have polluted the structure by squeezing in another insert instruction in-between, even though this is not reflected in the code. Such bugs are very difficult to identify because they are by their very nature transient - the problem will occur only rarely, and in appearance haphazardly. The SCOOP rules guarantee that the above calls may only occur in a routine of which x is a separate argument. So the intuitive expectation that the two calls act on the same object with no competing access in-between - as suggested by the code - indeed matches reality. If this property is not required, the calls to insert and value may just appear in different routines of the class, for a finer level of access control granularity.

The final synchronization mechanism is provided by a natural extension of the Design by Contract constructs of Eiffel. A precondition on a separate target, as in

insert (structure: CONTAINER; element: SOME_TYPE; position: INTEGER) require structure_not_void: structure =/ void structure_not_full: not structure.is_full element_not_void: element /= void valid_position: structure.is_valid_index (position) do ... ensure ... end

cannot keep its usual semantics of a correctness condition, because even if the client ensures it prior to a call some other client can invalidate it before the routine actually starts its execution, so that the routine would be wrong in assuming the precondition. It has to be reinterpreted as a wait condition. Hence a call such as

insert (s, e, p)

will proceed only when s is (as noted before) available on an exclusive basis to the client, and satisfy the precondition. This convention provides a simple and powerful synchronization technique.

These are the basic concepts of SCOOP. They are complemented by a few library mechanisms that tune the mechanism, for example to specify limits on the acceptable waiting time when trying to reserve an object. ... "

in the imperative world (or in jasper's imperative situations), where state can change in mid-procedure, you have not just timeless assertions, but pre-conditions (true at beginning), post-conditions (true at end), other conditions (true at interior points of time), and scoped invariants (true throughout scope). the former three can be captured by seq'd assertions, the first two can be implicitly seq'd by putting assertions at beginning or at end. check eiffel to see if there are other sorts of assertions

OOP system mixing haskell and CLOS: http://lambda-the-ultimate.org/node/3569

"attribute data structures with byte layout details for the compiler to follow, and you can't save that information as publicly accessible assembly details, which .NET also allows." -- http://lambda-the-ultimate.org/node/3569#comment-50883

"Your description of 'Behavior' sounds to me like a form of structural typing. Likewise 'Taxonomy' sounded very similar to nominal typing. The 'Implementation' category sounds maybe like a more powerful version of type qualifiers (final static etc..). " -- http://lambda-the-ultimate.org/node/3569#comment-50616

scoping: transactions, scoped invariants. separate syntax, or force subroutine?

"As an aside, I once got into a convo with one of the most vocal advocates of OSGi. He said he'd "like to see the standard pushed on other platforms like C++ and .NET." I asked him what he meant by that and whether .NET already fulfilled most of OSGi with its core Ecma .NET attributes such as AssemblyInfoAttribute?." -- http://lambda-the-ultimate.org/node/3569

nominative tying and structural typing -- structural typing is like graph shape matching in jasper, and nominative typing is like phantom types or types to denote units -- how to combine? sounds like the obvious thing to do is struct typing with constants in the pattern to be matched serving as "type tags" when nominative typing is desired, i.e. "inches" is declared as a pattern with two children, one a number (edge label 0), and one a metaattribute with edge label "unit" and the constant value "inch". hmmm, or mb better to take numbers themselves, clone them, and add a metaattrib named "unit", rather than adding an additional level of indirection by referencing the underlying number. or maybe combining these by having the metaattribute notation transparently construct and access a wrapper type which stores the underlying value at edge label 0.

might call the composable types used in normal Jasper code "features" b/c ppl associate "type" with what in jasper is the fully specifed type of a variable.


decidable dependent types stuff:

Dependent types are based on the idea of using scalars or values to more precisely describe the type of some other value. For example, "matrix(3,3)" might be the type of a 3×3 matrix. We can then define typing rules such as the following rule for matrix multiplication:

    matrix_multiply : matrix(k,m) × matrix(m,n) ’ matrix(k,n)

where k, m, n are arbitrary positive integer values. A variant of ML called Dependent ML has been created based on this type system, but because type-checking conventional dependent types is undecidable, not all programs using them can be type-checked without some kind of limitations. Dependent ML limits the sort of equality it can decide to Presburger arithmetic; other languages such as Epigram make the value of all expressions in the language decidable so that type checking can be decidable, ...

http://www.cs.nott.ac.uk/~txa/publ/ott-conf.pdf Observational Type Theory (epigram 2)

http://en.wikipedia.org/wiki/Intuitionistic_type_theory something to do with Epigram

http://en.wikipedia.org/wiki/ATS_%28programming_language%29 supercedes Dependent ML

  ATS also has the good idea of token prefixes which indicate boxed or unboxed types

---

linear types uniqueness types

http://en.wikipedia.org/wiki/Charity_%28programming_language%29 more than primitive recursive, but not turing complete; category-theory-esque

http://en.wikipedia.org/wiki/Concurrent_constraint_logic_programming


phantom types: types without any values; used to embed type systems into haskell

manifest typing: when you have to declare types. vs. dynamic typing, and vs. type inference.



http://c2.com/cgi/wiki?ComputationalPrimitives : " , but here's what I think of when I hear of computation primitives:

Imperative language primitives:

object-oriented language primitives:

LogicProgrammingLanguage? primitives:

functional language primitives:

relational language primitives:

Digital logic primitives:

" hua Bloch's 2006 JavaOne? presentation:

 public interface Shop<T> {
	T buy();
	void sell(T item);
	void sell(Collection<? extends T> lot);
	void buy(int numItems, Collection<? super T> myStuff);
 }
 // You can buy a bunch of models from the train shop
 modelTrainShop.buy(5, myModels);
 // You can sell your train set to the model shop
 modelShop.sell(myTrains);

Basic rule:


i think covariance is for code reuse, contravariance for substitution. a similar view:

" Covariance and contravariance : conflict without a cause Castagna, Giuseppe ACM Transactions on Programming Languages and Systems Vol.17, No. 3 (May 1995), pp. 431-447

From the Abstract: (http://citeseer.ist.psu.edu/castagna94covariance.html) (Full text is on http://www.cs.trinity.edu/~mlewis/CSCI3294-F01/Papers/p431-castagna.pdf.)

In type-theoretic research on object-oriented programming, the issue of "covariance versus contravariance" is a topic of continuing debate. In this short note we argue that covariance and contravariance appropriately characterize two distinct and independent mechanisms. The so-called contravariance rule correctly captures the subtyping relation (that relation which establishes which sets of functions can replace another given set in every context). A covariant relation, instead, characterizes the specialization of code (i.e., the definition of new code which replaces old definitions in some particular cases). Therefore, covariance and contravariance are not opposing views, but distinct concepts that each have their place in object-oriented systems. Both can (and should) be integrated in a type-safe manner in object-oriented languages.

We also show that the independence of the two mechanisms is not characteristic of a particular model but is valid in general, since covariant specialization is present in record-based models, although it is hidden by a deficiency of all existing calculi that realize this model. As an aside, we show that the lambda-calculus can be taken as the basic calculus for both an overloading-based and a record-based model. Using this approach, one not only obtains a more uniform vision of object-oriented type theories, but in the case of the record-based approach, one also gains multiple dispatching, a feature that existing record-based models do not capture.

The resulting type system is similar to that of CecilLanguage?. "


defn of contra vs. covariance:

" Say you have a class Foo, which has a method bar(). Method bar() takes an argument of type middle_type, and returns a value of type middle_type.

Now you make a subclass of Foo called SubFoo?, and you override bar(). What types can the new bar() take? What types can it return?

Look at return types first: we want to be able to substitute SubFoo? where existing code expects Foo, so it needs to return things of type middle_type, or of some subtype (e.g. sub_type). This should be pretty obvious.

As for what types our new bar() can take: One answer is: bar() can take only things of type middle_type. You can't declare it to take sub_type, and you can't declare it to take super_type. This is called invariance.

Another answer: bar() can only be declared to take things that are a subtype of middle_type - so middle_type is OK, and sub_type is OK, but super_type is out. This is called covariance.

Finally, the third answer: bar() can only be declared to take things that are a supertype of middle_type - so any of middle_type, and super_type may be passed to bar(), but sub-type is not allowed. This is called contravariance.

Covariance seems to jibe with our notion that subclasses are more specialized, less general than their superclasses. So you might have a Collection class which takes and returns Objects; you could subclass it to make a FooCollection? class which takes and returns Foos (where Foo is a subclass of Object).

Contravariance sounds kind of counterintuitive at first, but it's actually just analogous to the famous advice about implementing protocols: "Be liberal in what you accept, and conservative in what you send." So just as you have to return a subtype of the original bar()'s return type, you have to accept any supertype of whatever the original bar() accepts.

If your type system enforces contravariance of parameters, then it can tell at compile time whether your code is typesafe (cf. SatherLanguage?, ObjectiveCaml?). If it enforces covariance (Cf. cf. EiffelLanguage?, CeePlusPlus?), it can't really do that, but it can make some good guesses (cf. EiffelLanguage?) - though Eiffel are known to crash when it guesses wrong. " -- http://c2.com/cgi/wiki?ContraVsCoVariance


http://www.modernperlbooks.com/mt/2009/04/the-why-of-perl-roles.html http://www.modernperlbooks.com/mt/2009/05/perl-roles-versus-inheritance.html http://www.modernperlbooks.com/mt/2009/05/perl-roles-versus-duck-typing.html http://www.modernperlbooks.com/mt/2009/05/perl-roles-versus-interfaces-and-abcs.html http://www.modernperlbooks.com/mt/2009/05/more-roles-versus-duck-typing.html


" A trait is different from a mixin in that its individual methods can be ma- nipulated with trait operators such as sum (merge the methods of two traits), exclude (remove a method from a trait), and alias (add a copy of a method with a new name; do not redirect any calls to the old name). The practical difference between mixins and " -- http://www.cs.utah.edu/plt/publications/aplas06-fff.pdf Scheme with Classes, Mixins, and Traits

mb mixins dont have state either? "The classes BaseWidget?, Widget and Label have state and they take the role of base classes, not mixins." -- http://www.artima.com/weblogs/viewpost.jsp?thread=246341

assertion that Component Architecture means using composition instead of inheritance "Actually, Zope 3 has been entirely rewritten with the goal of avoiding the mixin abuse of Zope 2 and to use composition instead of inheritance (this is basically what the buzzwords Component Architecture really mean)." -- http://www.artima.com/weblogs/viewpost.jsp?thread=246341

http://www.artima.com/weblogs/viewpost.jsp?thread=246341 asserts that mixins lead to classes with zillions of methods, leading to method name collisions:

" For instance, have a look at the hierarchy of the Plone Site class which I report in appendix. Between square backets you can see the number of methods/attributes defined per class, except special attributes. The plot comes from a real Plone application I have in production. The total count is of 38 classes, 88 names overridden, 42 special names and 648 regular names: a monster.

To trace the origin of the methods and to keep in mind the hierarchy is practically impossibile. Moreover, both autocompletion and the builtin help facility become unusable, the self-generated class documentation become unreadable since too big.

In other words, a design based on mixins works for small frameworks, but it does not scale at all to large frameworks. Actually, Zope 3 has been entirely rewritten with the goal of avoiding the mixin abuse of Zope 2 and to use composition instead of inheritance (this is basically what the buzzwords Component Architecture really mean).

My hate for mixins comes from my experience with Zope/Plone. However the same abuses could be equally be done in other languages and object systems - with the notable exception of CLOS, where methods are defined outside classes and therefore the problem of class namespace pollution does not exist - in the presence of huge frameworks.

A consequence of namespace pollution is that it is very easy to have name clashes. Since there are hundreds of methods and it is impossible to know all of them, and since method overriding is silent, this is a real problem: the very first time I subclassed a Plone class I run into this issue: I overrode a pre-defined method inadvertently, by causing hard to investigate problems in an unrelated part of the code. "

suggests using generic fns instead, and modules for namespace control:

"I am a big fan of generic functions which are already used in the Python word - print is a generic function, the comparison operators are generic functions, numpy universal functions (ufunctions) are generic functions, etc - but should be used even more. With generic functions, mixins becomes useless. A side effect is that the class namespace becomes much slimmer: for instance, in CLOS classes are used just to contain state, whereas the methods live in a separate namespace. In most languages instead, classes are used as a namespace control mechanism, performing double duty - namespace control should be the job of modules." http://www.artima.com/weblogs/viewpost.jsp?thread=246341


python's super

" There is no superclass in a MI world

Readers familiar will single inheritance languages, such as Java or Smalltalk, will have a clear concept of superclass in mind. This concept, however, has no useful meaning in Python or in other multiple inheritance languages. I became convinced of this fact after a discussion with Bjorn Pettersen and Alex Martelli on comp.lang.python in May 2003 (at that time I was mistakenly thinking that one could define a superclass concept in Python). Consider this example from that discussion:

           +-----+
           |  T  |
           |a = 0|
           +-----+
         /         \
        /           \
    +-------+    +-------+
    |   A   |    |   B   |
    |       |    | a = 2 |
    +-------+    +-------+
        \           /
         \         /
           +-----+
           |  C  |
           +-----+
              :
              :    instantiation
              c
    >>> class T(object):
    ...     a = 0
    >>> class A(T):
    ...     pass
    >>> class B(T):
    ...     a = 2
    >>> class C(A,B):
    ...     pass
    >>> c = C()

What is the superclass of C? There are two direct superclasses (i.e. bases) of C: A and B. A comes before B, so one would naturally think that the superclass of C is A. However, A inherits its attribute a from T with value a=0: if super(C,c) was returning the superclass of C, then super(C,c).a would return 0. This is NOT what happens. Instead, super(C,c).a walks trought the method resolution order of the class of c (i.e. C) and retrieves the attribute from the first class above C which defines it. In this example the MRO of C is [C, A, B, T, object], so B is the first class above C which defines a and super(C,c).a correctly returns the value 2, not 0:

    >>> super(C,c).a
    2

You may call A the superclass of C, but this is not a useful concept since the methods are resolved by looking at the classes in the MRO of C, and not by looking at the classes in the MRO of A (which in this case is [A,T, object] and does not contain B). The whole MRO is needed, not just the first superclass.

So, using the word superclass in the standard docs is misleading and should be avoided altogether. Bound and unbound (super) methods

Having established that super cannot return the mythical superclass, we may ask ourselves what the hell it is returning ;) The truth is that super returns proxy objects. " --- http://www.artima.com/weblogs/viewpost.jsp?thread=236275

involving the MRO (method resolution order) leads to all sorts of tricky results:

"

Remember to use super consistently

Some years ago James Knight wrote an essay titled Super considered harmful where he points out a few shortcomings of super and he makes an important recommendation: use super consistently, and document that you use it, as it is part of the external interface for your class, like it or not. The issue is that a developer inheriting from a hierarchy written by somebody else has to know if the hierarchy uses super internally or not. For instance, consider this case, where the library author has used super internally:

  1. library_using_super

class A(object): def __init__(self): print "A", super(A, self).__init__()

class B(object): def __init__(self): print "B", super(B, self).__init__()

If the application programmer knows that the library uses super internally, she will use super and everything will work just fine; but it she does not know if the library uses super she may be tempted to call A.__init__ and B.__init__ directly, but this will end up in having B.__init__ called twice!

    >>> from library_using_super import A, B
    >>> class C(A, B):
    ...     def __init__(self):
    ...         print "C",
    ...         A.__init__(self)
    ...         B.__init__(self)
    >>> c = C()
    C A B B

On the other hand, if the library does not uses super internally,

  1. library_not_using_super

class A(object): def __init__(self): print "A",

class B(object): def __init__(self): print "B",

the application programmer cannot use super either, otherwise B.__init__ will not be called:

    >>> from library_not_using_super import A, B
    >>> class C(A,B):
    ...     def __init__(self):
    ...         print "C",
    ...         super(C, self).__init__()
    >>> c = C()
    C A

So, if you use classes coming from a library in a multiple inheritance situation, you must know if the classes were intended to be cooperative (using super) or not. Library author should always document their usage of super. Argument passing in cooperative methods can fool you

James Knight devolves a paragraph to the discussion of argument passing in cooperative methods. Basically, if you want to be safe, all your cooperative methods should have a compatible signature. There are various ways of getting a compatible signature, for instance you could accept everything (i.e. your cooperative methods could have signature *args, kw) which is a bit too much for me, or all of your methods could have exactly the same arguments. The issue comes when you have default arguments, since your MRO can change if you change your hierarchy, and argument passing may break down. Here is an example:

"An example of argument passing in cooperative methods"

class A(object): def __init__(self): print 'A'

class B(A): def __init__(self, a=None): print 'B with a=%s' % a super(B, self).__init__(a)

class C(A): def __init__(self, a): print 'C with a=%s' % a super(C, self).__init__()

class D(B, C): def __init__(self): print 'D' super(D, self).__init__()

>>> from cooperation_ex import D >>> d = D() D B with a=None C with a=None A

This works, but it is fragile (you see what will happen if you change D(B, C) with D(C, B)?) and in general it is always difficult to figure out which arguments will be passed to each method and in which order so it is best just to use the same arguments everywhere (or not to use cooperative methods altogether, if you have no need for cooperation). There is no shortage of examples of trickiness in multiple inheritance hierarchies; for instance I remember a post from comp.lang.python about the fragility of super when changing the base class.

Also, beware of situations in which you have some old style classes mixing with new style classes: the result may depend on the order of the base classes (see examples 2-2b and 2-3b in Super considered harmful). " -- http://www.artima.com/weblogs/viewpost.jsp?thread=237121

Simionato concludes that the problem is with multiple inheritance itself; he prefers generic multimethods, traits, or mixins.

he defines mixins and traits as so:

" I personally liked super, cooperative methods and multiple inheritance for a couple of years, then I started working with Zope and my mind changed completely. Zope 2 did not use super at all but is a mess anyway, so the problem is multiple inheritance itself. Inheritance makes your code heavily coupled and difficult to follow (spaghetti inheritance). I have not found a real life problem yet that I could not solve with single inheritance + composition/delegation in a better and more maintainable way than using multiple inheritance. Nowadays I am very careful when using multiple inheritance.

People should be educated about the issues; moreover people should be aware that there are alternative to multiple inheritance in other languages. For instance Ruby uses mixins (they are a restricted multiple inheritance without cooperative methods and with a well defined superclass, but they do not solve the issue of name conflicts and the issue with the ordering of the mixin classes); recently some people proposed the concepts of traits (restricted mixin where name conflicts must be solved explicitely and the ordering of the mixins does not matter) which is interesting.

In CLOS multiple inheritance works better since (multi-)methods are defined outside classes and call-next-method is well integrated in the language; it is simpler to track down the ancestors of a single method than to wonder about the full class hierarchy. The language SML (which nobody except academics use, but would deserve better recognition) goes boldly in the direction of favoring composition over inheritance and uses functors to this aim.

" http://www.artima.com/weblogs/viewpost.jsp?thread=237121

here's a post in which he argues for plain multiple-dispatch functions over mixins: http://www.artima.com/weblogs/viewpost.jsp?thread=237764

altho i note: then you have to add multimethod cases to the functions when you define a new class that would otherwise have used a different mixin. if you don't own the code for the multimethod fn, you're in trouble (unless it can be overridden remotely for a given type)


http://www.muthukadan.net/docs/zca.html


according to http://www.holub.com/goodies/uml/:

" Aggregation (comprises) relationship relationship.1 Destroying the "whole" does not destroy the parts.

Composition (has) relationship.1 The parts are destroyed along with the "whole." "

and

" (1) Composition vs. Aggregation: Neither "aggregation" nor "composition" really have direct analogs in many languages (Java, for example).

An "aggregate" represents a whole that comprises various parts; so, a Committee is an aggregate of its Members. A Meeting is an aggregate of an Agenda, a Room, and the Attendees. At implementation time, this relationship is not containment. (A meeting does not contain a room.) Similaraly, the parts of the aggregate might be doing other things elsewhere in the program, so they might be refereced by several objects. In other words, There's no implementation-level difference between aggregation and a simple "uses" relationship (an "association" line with no diamonds on it at all). In both cases an object has references to other objects. Though there's no implementation difference, it's definitely worth capturing the relationship in the UML, both because it helps you understand the domain model better, and because there are subtle implementation issues. I might allow tighter coupling relationships in an aggregation than I would with a simple "uses," for example.

Composition involves even tighter coupling than aggregation, and definitely involves containment. The basic requirement is that, if a class of objects (call it a "container") is composed of other objects (call them the "elements"), then the elements will come into existence and also be destroyed as a side effect of creating or destroying the container. It would be rare for a element not to be declared as private. An example might be an Customer's name and address. A Customer without a name or address is a worthless thing. By the same token, when the Customer is destroyed, there's no point in keeping the name and address around. (Compare this situation with aggregation, where destroying the Committee should not cause the members to be destroyed---they may be members of other Committees).

In terms of implementation, the elements in a composition relationship are typically created by the constructor or an initializer in a field declaration, but Java doesn't have a destructor, so there's no way to guarantee that the elements are destroyed along with the container. In C++, the element would be an object (not a reference or pointer) that's declared as a field in another object, so creation and destruction of the element would be automatic. Java has no such mechanism. It's nonetheless important to specify a containment relationship in the UML, because this relationship tells the implementation/testing folks that your intent is for the element to become garbage collectable (i.e. there should be no references to it) when the container is destroyed. "


"synchronization": "As we shall see, the purpose of a synchronization algorithm is to en- force precedence relations among operation executions." -- Arbiter-Free Synchronization Distributed Computing 16, 2/3, (2003) 219-237.


http://scienceblogs.com/goodmath/2006/10/haskell_and_scheme_which_one_a.php ---

in go, a method can satisfy an interface regardless of whether it modifies the input args?:

www.stanford.edu/class/ee380/Abstracts/100428-pike-stanford.pdf " type Stringer interface { String() string } func print(args ...Stringer) { for i, s := range args { if i > 0 { os.Stdout.WriteString?(" ") } os.Stdout.WriteString?(s.String()) } } print(Day(1), Fahrenheit(72.29)) => Monday 72.3°F Again, these methods do not take a pointer, although another type might define a String() method that does, and it too would satisfy Stringer. "

---

http://en.wikipedia.org/wiki/Cyclone_programming_language

--- " Not sure if this

Not sure if this is the type of criticism you are looking for (these are mostly engineering type problems) but the reasons I still program in non-functional languages are:

Lack of functional polymorphism leads to namespace crowding, i.e. no operator overloading. Type classes help address this somewhat, but it is still a pain point.

Lack of sugar. Despite having clearly inferior support for map/filter/reduce, most scripting languages provide cleaner string processing functionality because of OO-enabled operator overloading and plenty of sugar.

State. Again is an obvious one, but the lack of state is also a plus for pure functional languages. It is a trade-off, and is one of the main features that makes pure functional languages unique. For me this is clearly a feature and not a bug. I do think though, that many of the common problems caused by the lack of state could be addressed with a good helping of sugar on the part of the compiler. " -- http://lambda-the-ultimate.org/node/3924#comment-58889

http://www.haskell.org/haskellwiki/DDC

toread: http://lambda-the-ultimate.org/node/2700 http://conal.net/papers/icfp97/ http://lambda-the-ultimate.org/node/3924#comment-59037 http://lambda-the-ultimate.org/node/3924

---

Add me to the long list of functional programming loyalists who really hopes that the monad complexity issues is resolved or mitigated. Having five million slight variation of the state monad is a black mark on Haskell and the antithesis of clean, generic, extensible design.

Still love it anyways.

Posted by: Matt Skalecki

-- http://scienceblogs.com/goodmath/2009/11/philosophizing_about_programmi.php#comment-2063737 ---
November 10, 2009 4:31 PM

In the .NET world, "command/query separation" seems to be among the major buzzwords. Every method should either return information about the current state (query) or change that state (command); queries should never change the objects they're querying.

-- http://scienceblogs.com/goodmath/2009/11/philosophizing_about_programmi.php#comment-2063771

--- Which is a great principle to apply to one's programming, but it continues to piss me off that slavish adherence to c/q separation makes it impossible to write a proper "pop" method for a stack. --- http://scienceblogs.com/goodmath/2009/11/philosophizing_about_programmi.php#comment-2063832

--- should try to understand this someday: http://scienceblogs.com/goodmath/2009/11/philosophizing_about_programmi.php#comment-2063906

--- 18

" @9:

    If a function mutates local variables, but does not mutate global state, for all practical purposes it is just as good as a stateless function. From the callers perspective it is stateless.

Indeed, and Haskell's type system is rich enough to express this constraint. See http://www.haskell.org/haskellwiki/Monad/ST#A_few_simple_examples

Posted by: Dan

November 10, 2009 7:41 PM

" -- http://scienceblogs.com/goodmath/2009/11/philosophizing_about_programmi.php#comment-2064185

me: yes, but... the other guy's point is that having to use the ST monad for that isn't quite as convenient.

toread: http://cs.anu.edu.au/~Ben.Lippmeier/project/thesis/thesis-lippmeier-sub.pdf

---

look AGAIN at closures, coroutines

---

There was more to T than implementation technology; there was also a lot of beautiful language design happening. Jonathan seized the opportunity to make a complete break with backwards compatibility in terms of the runtime library and even the names chosen. Somewhere in the T 2 effort, Kent Pitman, another Lisp wizard, came down to Yale from MIT. He and Jonathan poured an immense amount of design effort into the language, and it was just really, really *clean*. Small (but difficult) things: they chose a standard set of lexemes and a regular way of assembling them into the names of the standard procedures, so that you could easily remember or reconstruct names when you were coding. (I have followed this example in the development of the SRFIs I've done for the Scheme community. It is not an easy task.)

Larger, deeper things: they designed a beautiful object system that was integrated into the assignment machinery -- just as Common Lisp's SETF lets you assign using accessors, e.g., in Common Lisp (setf (car x) y) is equivalent to (rplaca x y) in T, (set! (car x) y) was shorthand for ((setter car) x y) Accessor functions like CAR handled "generic functions" or "messages" like SETTER -- CAR returned the SET-CAR! procedure when sent the SETTER message. The compiler was capable of optimising this into the single Vax store instruction that implements the SET-CAR! operation, but the semantic machinery was completely general -- you could define your own accessor procedures, give them SETTER methods, and then use them in SET! forms.

(This turned out to be very nice in the actual implementation of the compiler. The AST was a tree of objects, connected together in both directions -- parents knew their children; children also had links to their parents. If the optimiser changed the else-part of an if-node N with something like this (set! (if-node:else n) new-child) which was really ((setter if-node:else) n new-child) the if-node:else's SETTER method did a lot of work for you -- it disconnected the old child, installed NEW-CHILD as N's else field, and set NEW-CHILD's parent field to be N. So you could never forget to keep all the links consistent; it was all handled for you just by the single SET! assignment.)

-- http://www.paulgraham.com/thist.html

--- on the need for optional single-use continuations even when multiple use ones are offered: http://www.nhplace.com/kent/PFAQ/unwind-protect-vs-continuations-original.html

good defn of "dynamic contours"?

--- http://axisofeval.blogspot.com/2010/04/dylan-and-lisp-family-trees-central.html

" Monday, April 26, 2010 Dylan and the Lisp family tree's central branch Since the early eighties (beginning with Scheme and T), most Lisps began to settle around a common core.

(This also coincides with the point in time when static scoping was finally understood, once and for all, after a painful and embarrassing history.)

With the exception of Scheme, most Lisps don't have multi-shot continuations. They seriously complexify a language, as can be seen in weird implementation techniques like Cheney on the MTA.

It's also hard (or even impossible?) to make a good UNWIND-PROTECT in the face of multi-shot continuations. And UNWIND-PROTECT is surely one of the most important control flow operators.

So what is the common core I'm talking about? You can see it best in Dylan, I think.

First of all, a Lisp that follows this common core design can be efficiently implemented on real hardware. No need to do weird stuff like stack copying or Cheney on the MTA. In fact, Dylan has, with a bit of squinting, the same dynamic (control flow) semantics as C.

Second, the common core is simply nice to program in.

Some features of the common core:

This core can be seen in languages that I consider the central branch of the Lisp family tree:

All of them are great languages, and worth detailed study.

In the future I hope to write about each of the features these languages share in more detail. Posted by Manuel J. Simoni at 3:53 PM Labels: cl, continuations, dylan, eulisp, goo, islisp, lisp, oop, plot "

---

"You know it is right when both simplicity and power are maximized, while at the same time confusion and the need for kludges are minimized.

PLOT emphasizes cleanliness, flexibility, and extensibility." -- http://users.rcn.com/david-moon/PLOT/page-1.html


more on PLOT:

http://users.rcn.com/david-moon/PLOT/Moon-ILC09.pdf

http://news.ycombinator.com/item?id=537652

--

9 points by gruseom 407 days ago

link

Wow, this is interesting on many levels. The slides from ILC are a good overview: http://users.rcn.com/david-moon/PLOT/Moon-ILC09.pdf. They contain this fascinating statement:

Traditionally, code walking has required ad hoc code to understand every “special form.” It is better to have a well-defined, object-oriented interface to the Abstract Syntax Tree, scopes, and definitions. This is why objects are better than S-expressions as a representation for program source code.

I have never heard anyone point to s-expressions as the reason code walkers are hard to write in Lisp (and they are, at least in CL).

---

http://users.rcn.com/david-moon/PLOT/page-2.html

General Principles " Everything about the language is to be defined in the language itself. The language is to be fully extensible by users, with no magic. Of course to be implementable this requires bootstrapping.

Anything that must be intrinsically known to the compiler is explicitly marked.

Define as much as possible by the binding of names to values.

Use naming conventions in place of Common Lisp's multiple name-binding spaces for functions, variables, types, and packages.

Discourage assignment, and make it syntactically obvious where it is allowed or occurs. But do not forbid it.

Discourage unnecessary type declarations, but allow them where needed.

Use hygienic macros pervasively.

Language-enforced access control is incompatible with macros and with debugging within the language, so do not attempt to provide that type of feature. Use naming conventions and social enforcement instead.

Arithmetic operations must always produce mathematically correct results. No insanity like (x + y) / 2 is sometimes a negative number when both x and y are positive.

Strive for programs to be readable yet concise. Hence use infix syntax, case-insensitive names, and nesting structure indicated by indentation rather than punctuation.

Minimize the use of punctuation and maximize the use of whitespace, for readability.

Avoid abbreviation, but when you must abbreviate, do it consistently.

Strive for simple concepts that combine in powerful ways. Keep removing unnecessary complex features until no more can be removed.

Take full advantage of classes and methods.

Do not conflate the concepts of class, module, scope, and encapsulation. Use simple concepts that combine in powerful ways instead of one overly powerful concept that tries to do everything. "

ELL kernel: http://github.com/manuel/ell/blob/master/KERNEL.org

--

" Here's what Ell is made of:

 ---

http://research.sun.com/projects/plrg/Publications/ICFPAugust2009Steele.pdf

more steele pubs at http://labs.oracle.com/projects/plrg/Publications/


" IMO, Python is farther away from the Lisp genotype than Java. At least Java has post-1980's scoping rules. " -- http://axisofeval.blogspot.com/2010/05/next-lisps.html

"2 comments:

fogus said...

    I'm enjoying your blog so far, even though I get the impression that my leg is being pulled at times. ;-) It's definitely hard to dispute that the specific Clojure code linked to is ugly, but I will say that's it's not representative of the entire look and feel. The annotation support, for better or worse, is a necessity given that Clojure strives to interoperate with Java. Many of the interop forms are less than ideal simply because they require a lot of Java's semantics to taint the pool -- so to speak. Thankfully, the division between interop forms and pure Clojure is very clear and generally allow the ugly bits to be hidden away or outright avoided.
    :f
    Mon May 10, 05:38:00 PM CEST " http://axisofeval.blogspot.com/2010/05/next-lisps.html commenting on http://gist.github.com/377213