Doug Lea discusses the Fork/Join framework

Doug Lea discusses the evolution of the Fork/Join framework, the new java.util.concurrent features in Java 7, future trends in concurrent programming, and approaches to parallel computing across different languages.

Summary

Doug Lea talks to InfoQ about the evolution of the Fork/Join Framework, the new features planned for java.util.concurrent in Java 7, and the "Extra 166" package. The interview goes on to explore some of the hardware and language changes that are impacting concurrent programming, and the effect the increasing prevalence of alternative languages on the JVM is having on library design.

Bio

Doug Lea is a professor of computer science at the State University of New York at Oswego, where he specialises in concurrent programming and the design of concurrent data structures. He wrote "Concurrent Programming in Java: Design Principles and Patterns", one of the first books on the subject, and chaired JSR 166, which added concurrency utilities to Java.

About the conference

Starting in 1986, the OOPSLA conference has proven to be the cradle of many techniques and methodologies that have become mainstream over the years: OOP, patterns, AOP, XP, unit testing, UML, wikis, and refactoring. Gaining its prestige with three academic tracks, OOPSLA has managed to attract researchers, educators, and developers every year. The event is sponsored by the ACM.

My name is Ryan Slobojan and I'm here with Doug Lea, a professor at the State University of New York at Oswego. Doug, can you tell us a little bit about the Fork/Join Framework that you've been working on?

Sure. Fork/Join is a little engine that is intended to make it easier to get high-performance, parallel, fine-grained task execution in Java. It's a little bit of a departure for us in java.util.concurrent, because mainly we've been concentrating on server-side asynchronous communication, and we've actually had this framework around for a while. I first created a version of it in 1998 and wrote about it then, and we've really been waiting for ubiquitous multicore MP computers to arrive, so that we don't put in something that doesn't actually help most people.

We've also had the luxury of having a few years to get it right in the meantime, because we didn't ship it with Java 5.0. It was actually triaged off the list of things to put out in java.util.concurrent.

We're also extremely conservative engineers. We think real hard about the code in java.util.concurrent before we release it, and we usually take our time and just ship it when it's done.

The basic idea is that it's a framework that itself takes a little bit of getting used to to program in directly. It's a framework that most enjoys executing parallel, recursively decomposed tasks. That is, if you have a big problem, divide it in two, solve the two parts in parallel, and then, when they're done, combine the results. It turns out that, in the same way that you can build anything from recursion, you can build things that operate on arrays, on maps, on anything out of that. Because we're just a library, that's where we stop. We have an engine that has a Fork/Join task, a Fork/Join pool, and a few other little helper classes here and there. Those people who enjoy doing this - and there are a few - are getting really good results. We have really excellent performance. It depends of course on the problem you are solving, but we are capable of producing code with at least as good absolute performance and speed-ups as frameworks in other languages.
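The divide-solve-combine pattern described here can be sketched with the framework's RecursiveTask class. This is a minimal illustration, not code from the interview; the class name, threshold value, and array-summing problem are my own choices.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative sketch: sum an array by recursive decomposition.
// Below THRESHOLD we solve sequentially; above it we split in two,
// fork one half so another worker can steal it, compute the other
// half ourselves, then join and combine the results.
public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // arbitrary cutoff for this sketch
    private final long[] array;
    private final int lo, hi;

    public SumTask(long[] array, int lo, int hi) {
        this.array = array;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {             // small enough: solve directly
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += array[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;              // divide the problem in two
        SumTask left = new SumTask(array, lo, mid);
        SumTask right = new SumTask(array, mid, hi);
        left.fork();                            // make the left half stealable
        return right.compute() + left.join();   // solve right, then combine
    }

    public static long sum(long[] array) {
        return new ForkJoinPool().invoke(new SumTask(array, 0, array.length));
    }
}
```

Computing one half directly instead of forking both keeps the current worker busy, so a steal only happens when some other worker has run out of its own work.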

Part of this is that we have the luxury of building this framework on systems that have, for example, high-performance, scalable, concurrent garbage collection, because garbage collection in these frameworks turns out to be an interesting issue. If you are implementing them, say in C, you generally have to slow down a lot of things in order to get the memory allocation right. In our case, we will spew lots of garbage, but it's very well-behaved garbage, and it's picked up by the garbage collector because it is unused 99% of the time.

The whole idea is that when you think you might do something in parallel, you make a little task, you do this internal, very slick work-steal and queue manipulation, and then you see if anybody steals it from you. If anybody steals it from you, it's because they didn't have enough to do when they decided to do some of your work for you. Once you get the hang of that, there's a lot of things you can do. We're also, though, very conscious these days of building libraries for platform support. I think that there is upcoming support in Scala for nicer syntax, for automatically generating some of these recursive decompositions.

Clojure has been doing this for a while, and X10 - the IBM high-performance language - will be doing this; Fortress in its current implementation uses our framework, as do several other research projects and experimental systems. We're providing fine-grained parallelism and control for the languages that run on JVMs, rather than for Java in particular, and that's really the way I think about this work - I write it in Java on a VM because I can create very efficient algorithms.

I really don't care what syntax people use to access it, which is a little bit of a shift in focus for those of us building libraries in Java, where we were at first mainly concerned with getting the Java APIs right. Now we really want to get the functionality broad enough that, regardless of whether you are programming it in Java or Scala or whatever, you can still benefit.

What are some of the tasks which are supported out of the box with the Fork/Join Framework?

That's a good question, because the answer in the initial release, which should be going into one of the OpenJDK milestones very, very soon, is "almost nothing". That is, it is very possible to build things like arrays for which every time you want to operate you say "apply to all" or "map" or "reduce", which are increasingly familiar notions. We don't ship those in Java because we are really unsure of language directions. With closures, function types, and other miscellany like that, if they are going to be in the language, you would support them in Java very differently than the way we would have to otherwise.

We chickened out; we are not going to release the layers on top of this ourselves. We make them available, so there is a package called Extra 166 that has all the things that we think are shippable but don't ship, because we're very conservative engineers. When you are putting code on a billion machines, you really don't want to make the serious mistake of putting something out that you will somehow need to retract and pull back in a year or two. That means that right now, the people who are using this framework are going to be the people who actually get into this parallel recursive decomposition and know how to use it.

There are actually many things these frameworks can do. Not only are things like parallel for-each loops possible, but both the Clojure and Scala actor frameworks use the same engine, with some different parameter settings than you would use for parallel decomposition. The basic work-stealing framework is available to do a lot of things, not just "apply to all" for arrays.
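A parallel "apply to all" of the kind mentioned above can be sketched with RecursiveAction, the result-less counterpart of RecursiveTask. Again this is an illustrative sketch of my own, not interview code; the class name, threshold, and squaring operation are arbitrary.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Illustrative sketch of a parallel for-each / apply-to-all:
// square every element of an array in place. RecursiveAction is
// used because the task produces no result, only side effects on
// disjoint array ranges.
public class SquareAll extends RecursiveAction {
    private static final int THRESHOLD = 512; // arbitrary cutoff for this sketch
    private final double[] a;
    private final int lo, hi;

    SquareAll(double[] a, int lo, int hi) {
        this.a = a;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {
            for (int i = lo; i < hi; i++) a[i] = a[i] * a[i];
        } else {
            int mid = (lo + hi) >>> 1;
            // Run both halves in parallel and wait for both to finish.
            invokeAll(new SquareAll(a, lo, mid), new SquareAll(a, mid, hi));
        }
    }

    public static void squareAll(double[] a) {
        new ForkJoinPool().invoke(new SquareAll(a, 0, a.length));
    }
}
```

Because each subtask touches a disjoint range, no synchronization beyond the join implied by invokeAll is needed.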

Some of the extras that you described seem like they would fit very well into an Apache Commons type of framework. What are your thoughts on that?

Our main mission is that we want to get what we think are the central, most important bits of functionality into the JDK proper. So again, we're a little conservative - in cases where we weren't as conservative as we wanted to be, we've already regretted it. There are components that people have asked for for years that we still don't put in, because we are not fully happy with them.

We have many, many requests for a concurrent cache, a variant of a concurrent hash map that has some sort of reclamation policy, discards, and the like. We're not satisfied with any of our solutions, so we don't ship it. We do, however, put in our source repositories all the things we think are shippable that we don't ship. There are people who use them, and they are not as well engineered, but there are many people using several things that have never made it into the JDK.

We have, for example, a really pretty good non-blocking, double-ended queue that we put together for Java 6.0 and then decided that the demand for it was not really enough to outweigh the added maintenance burden and API support and all of that for something that really wasn't used very much. We do that, too. What it means is that when you look in java.util.concurrent you see our best guesses of what you might want to use - and, again, for really different audiences.

Classically, when we started out, our by far main audience were people doing server-side stuff - lots of clients, heavily concurrent, lots of threads, maintenance - and so we have a lot of components that are really good for that level of thing. As multicores and MPs get more prevalent, we want to support finer-grained stuff. That's where we're heading, and we'll do it when we're good and ready.

What are your thoughts on the future direction of the Fork/Join Framework?

We have frameworks that are really well tuned for the medium-term future, the future of dozens to hundreds of cores or CPUs, with enough data parallelism to keep them happy. We don't have remote execution, and we don't have any clustering support: so if you want to split up a large task across several multicores, each running in a Sun box - we don't do that.

If you would like to try to run on more than 1,000 cores, we're a little scared and nervous, because we don't actually know very much about the scalability of some of this. We're obsessive about testing real performance on real platforms, whenever these are developed. We're testing on the little two-way boxes, dual 8-core Nehalems, dual Niagara 64-thread boxes, Azul boxes, but for the next step, where people are thinking "maybe we'll get to 1,000 cores", we're not so sure. There are some scalability issues that are maybe going to make us rethink how we go about a fair amount of it.

The great thing about concurrent programming is you never get old; there is an infinite number of new problems that keep attacking you. [For example] the increasing cost of cache misses on multiprocessor multicores: if you have two Intel i7s, then you have a very different box than with one of them - a really different box - and we try to put out code that does not cheat and say "well, let's look at what kind of box we're on"; we put out code that we're pretty confident will run on a generation of machines.

Our ability to predict past a few years is non-existent. That's always been so. There are several things that were in the initial release of java.util.concurrent that are going to need a little bit of maintenance upkeep, because there are engineering trade-offs hiding in some of the implementations that don't make as much sense as they used to. We will be doing a little bit of routine maintenance after all these big things in JDK 7 come out, hopefully soon.

There are many languages with many approaches to parallel computing. In an absolutely ideal universe, given the way that parallel programming seems to be progressing with multiple cores, what do you think is the best, most intuitive approach to parallel programming from the programmer's perspective?

That's an unanswerable question. We had a workshop on teaching concurrency here at OOPSLA on Monday, and there are many ideas, but I think everyone believes that every student - because we were talking about teaching - but really every developer, should have some understanding of coordinating asynchronous processes. That is, maybe you generate threads, maybe you use some locks, maybe you decide to do it using events and actors that are sending messages. This is the coordination of naturally asynchronous stuff. Why do you create a thread? Because you have another client, or because you have another node in your simulation of a biological process.

It's not because you want a speed-up; it's just that that's the way the world is, and you have to learn how to coordinate it. The much-overhyped other side is: well, you have these things, why don't you make your programs faster? They are really different points of view, and I think everyone needs to understand them well, in part because parallelism for the sake of speed-up can be very easy.

If you have a decent language tool or other support for saying "all the operations in this loop could run in parallel" - there are languages being devised, and several were reported on here at OOPSLA in the past few days - maybe we can get people to annotate their code, so that they wouldn't have to guess and we could prove that it was OK for this loop to run in parallel. I'm very much for that; I wish them a lot of success.

My role is as a library guy. I don't pretend to know very much about language design, but I am very happy to yell at language designers for as long as it takes to get them to see my point of view about what the underlying implementations would like to do. The short answer is that everybody should understand data parallelism - apply-to-all, for-each, map-reduce - and that's good, because most people do understand that.

The other aspect is intrinsically a little harder. When you have concurrency that is not really of your choosing, then usually it's less regular, and you have to use more custom techniques. That is why we have a lot of these custom techniques in java.util.concurrent. So we have phasers and barriers and countdowns and a bunch of things that are all really good solutions to a class of problems, with no aspiration to universality. If you are doing resource control, use a semaphore. There is a classic way to do that and it works well; don't use any of these other things, please.
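The classic semaphore-based resource-control pattern he refers to looks roughly like this. The wrapper class and its names are my own for illustration; only java.util.concurrent.Semaphore comes from the library.

```java
import java.util.concurrent.Semaphore;

// Illustrative sketch of the classic resource-control idiom:
// a counting semaphore bounds how many threads may use a
// resource concurrently.
public class BoundedResource {
    private final Semaphore permits;

    public BoundedResource(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    public void use(Runnable work) {
        permits.acquireUninterruptibly(); // block until a permit is free
        try {
            work.run();                   // at most maxConcurrent threads here at once
        } finally {
            permits.release();            // always return the permit
        }
    }

    public int available() {
        return permits.availablePermits();
    }
}
```

The acquire/try/finally-release shape is the whole trick: the permit is returned even if the work throws, so the pool of permits can never leak.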

So there is a little bit more domain knowledge and a deeper understanding of concurrency needed to build a good scalable server than there is to build a good parallel ray tracer. A good parallel ray tracer is comparatively easy, but just as important, and that's why we're supporting it. We believe that everyone should be able to get parallel speed-ups without having to invent their own work-stealing framework, which took about a decade to produce the way we like it.

One of the difficulties that I've encountered as a developer is that I can usually reason very easily about the behavior of a sequential program, but as soon as parallelism comes in and things can seemingly happen almost at random, I have much more difficulty. How can I reason more effectively about a parallel program in Java?

Another good and not completely answerable question. We do think that there is a good, strong chance that languages can evolve to help here - one of the buzzwords these days is deterministic parallelism. That is parallelism that gives the same answer every time, at least to the extent you are allowed to know anything about the computation. So if you do a parallel operation on array elements and all those operations are independent, then you can't tell they've operated in parallel, so long as the thing you'd asked them to operate on has no weird side effects, doesn't affect globals, and doesn't have any ordering constraints.

For those kinds of things, I think there is actually language help on the way. There are people working on these issues, and several of them use annotation-style processing. Some of them you can think of as an extension of Mike Ernst's work on adding more information to types about constantness and purity. I think fairly happy thoughts about that aspect. The aspect that I think will always be difficult is if you want very high-performance, highly scalable, concurrent data structures; that's a difficult challenge.

Most people want to do it, but most people don't want to create their own red-black trees or B-trees either. Why do that yourself? Let those of us who are willing to spend a year finding out how to do a pure lock-free, nonblocking queue implementation do it. Let us do it, because we love that. That's my favorite thing to do in the whole world - to come up with new nonblocking concurrent algorithms - and they are very hard to reason about, and they are very low-productivity components. It will take a year, off and on, to put out 500 lines of code, but we really hope that no one else does it.
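For a flavor of what a nonblocking algorithm looks like, here is the classic Treiber stack - deliberately much simpler than the queue algorithms discussed, and not code from java.util.concurrent. Threads never lock; they retry a compare-and-set until they win.

```java
import java.util.concurrent.atomic.AtomicReference;

// Treiber's lock-free stack, as a small taste of nonblocking
// algorithms. Each push/pop loops on compareAndSet: if another
// thread changed the head concurrently, the CAS fails and we retry.
public class LockFreeStack<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<Node<T>>();

    public void push(T value) {
        Node<T> node = new Node<T>(value);
        Node<T> old;
        do {
            old = head.get();
            node.next = old;
        } while (!head.compareAndSet(old, node)); // retry if another thread won
    }

    public T pop() {
        Node<T> old;
        do {
            old = head.get();
            if (old == null) return null;          // empty stack
        } while (!head.compareAndSet(old, old.next));
        return old.value;
    }
}
```

Even in this tiny example, subtleties lurk (e.g. the ABA problem under memory reuse, which the garbage collector conveniently sidesteps in Java) - which is exactly why the full queue algorithms take so long to get right.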

I do it, people like maybe Cliff Click do it once in a while, a few people do it, and we invite everyone else: "Please learn enough so you can join us, because we need all the help we can get". But as for the number of people who have made it through - Maurice Herlihy and Nir Shavit have a new book called "The Art of Multiprocessor Programming", and it has a fairly tough reputation already. It's on creating nonblocking algorithms. In a sense, the subtitle of the book is "The Underlying Algorithms of java.util.concurrent", because many of them are. We'd like people to learn about them, but we don't want most engineers to implement them. They have work to do.

What are some of the differences between the java.util.concurrent libraries and the java.util libraries?

There is a little bit of a difference between java.util.concurrent and java.util, where java.util.concurrent really has this mission: put out the best algorithms and data structures we know. In java.util - HashMap or Vector or things like that - the whole mindset is to put out something that has no performance anomalies, which is a really different question.

With the java.util HashMap, you could probably do better - say, if you have a hash map full of ints, int keys - but only if you use it very carefully and knowingly in those cases. If you just use what's there, maybe you'll get a little bit of blow-up because of boxing and things like that, but it is a very regular algorithm, and we put high priority on lack of surprise. There is a little bit of a difference with the concurrent work, where part of it is that we specialize.

We have more kinds of queues, for example, than most people ever want to know about, and we apologize that it is a little confusing; even worse, we're adding yet another one in Java 7.0. Queues, of course, are very central to the internals of concurrency, because when threads are waiting they are put in queues, when messages are being sent they are put in queues, and when producers and consumers are exchanging data, they are put in queues.

So there are a lot of kinds of queues, and we don't make too many apologies for the fact that we are putting our eighth into JDK 7.0. It is just right for what it does - it's called LinkedTransferQueue - and it's very efficient. It includes a brand-new, pretty cool algorithm that's an extension of, and an improvement over, others that have been published, and - for those two people in the audience who want to know - its main virtue is that it can do both asynchronous and synchronous transfers, so you can send a message and forget about it without waiting, or you can send a message and wait for the receiver to get it.

We support both of those, which is actually sort of uncommon and algorithmically tricky, but increasingly needed. In fact, we needed it internally in the Fork/Join pools. We mainly developed it for ourselves, and then it took on a life of its own, as we found that many people actually need this functionality - not many application-level programmers, but people who are building server-side frameworks and the like.
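The dual asynchronous/synchronous behavior he describes can be sketched as follows. The class and message strings are my own illustration; put() and transfer() are the LinkedTransferQueue methods in question.

```java
import java.util.concurrent.LinkedTransferQueue;

// Illustrative sketch of LinkedTransferQueue's two handoff modes:
// put() enqueues and returns immediately (fire-and-forget), while
// transfer() blocks until a consumer actually receives the element.
public class TransferDemo {
    public static void main(String[] args) throws InterruptedException {
        final LinkedTransferQueue<String> q = new LinkedTransferQueue<String>();

        // Asynchronous: enqueue and move on, no receiver needed yet.
        q.put("fire-and-forget");

        // A consumer that will take both messages.
        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    q.take(); // receives "fire-and-forget"
                    q.take(); // receives "handshake", unblocking transfer()
                } catch (InterruptedException ignored) { }
            }
        });
        consumer.start();

        // Synchronous: blocks until the consumer takes this element.
        q.transfer("handshake");
        consumer.join();
    }
}
```

The same queue object serves both styles, which is the "dual" property that makes the underlying algorithm tricky.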

Pasted from <http://www.infoq.com/interviews/doug-lea-fork-join>
