Category Archives: Mira ESP: 2010-2013

Archive of blog postings from the ALCF Mira Early Science Program, which ran from 2010-2013.

Minimum partition size on Mira is 512 nodes; maximum backfill is 8192 nodes

As has been explained in a recent email to Mira users, the minimum partition you can use on the machine is 512 nodes. If you request fewer nodes, you still pay from your allocation for all 512, and the unused nodes are idle. On Cetus, the minimum partition size is 128 nodes.
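To make the charging rule concrete, here is a minimal sketch of how a small request maps onto billed nodes. The function names are hypothetical. The 16 cores per node figure and the formula "nodes times cores per node times wall-clock hours" are assumptions for illustration, not the accounting system's actual method. The sketch models only the minimum-partition floor; larger requests may be rounded up to the next available partition size, which is not modeled here.

```python
# Minimum partition sizes as stated in the post above.
MIN_PARTITION_NODES = {"mira": 512, "cetus": 128}

# Assumption: 16 cores per BG/Q node (16K cores per 1024-node rack).
CORES_PER_NODE = 16


def charged_nodes(requested_nodes, machine="mira"):
    """Nodes billed for a job: never fewer than the minimum partition."""
    if requested_nodes < 1:
        raise ValueError("requested_nodes must be at least 1")
    return max(requested_nodes, MIN_PARTITION_NODES[machine])


def charged_core_hours(requested_nodes, wall_hours, machine="mira"):
    """Core-hours billed, assuming nodes * cores per node * hours."""
    return charged_nodes(requested_nodes, machine) * CORES_PER_NODE * wall_hours
```

Under these assumptions, a 64-node, 2-hour job on Mira is billed as 512 nodes, so `charged_core_hours(64, 2)` gives 16384 core-hours, the same as a full 512-node job of that length.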

As some of you exhaust your ESP allocations on Mira, you will notice your jobs going into the “backfill” queue. These jobs are queued at lower priority than jobs from projects with a positive allocation balance, but they will run if resources are available and no normal job fits the open space. The maximum job size allowed in backfill mode is 8192 nodes.

Please test your codes with the new system driver on Vesta

This coming Monday and Tuesday (4-5 Feb. 2013), Vesta will be down for extended maintenance, to install the latest BG/Q system driver from IBM (V1R2M0). Eventually, this driver version will be installed on Mira. Please help ALCF and yourselves by building and testing your Early Science codes on Vesta after the upgrade, if you can. Let us know if something breaks.

Cetus Down

There will be an official notice going out soon, but be aware that Cetus is down and will be down for a number of days. This is related to the Vesta downtime—the BG/Q rack that’s currently designated as Cetus is being combined with Vesta to make Vesta a 2-rack system. We have a new rack that will be designated as Cetus. My best estimate is 5 days of downtime for Cetus (yesterday’s notice to vesta-notify and mira-notify lists estimated 5 days downtime for Vesta).

Early Science on Mira is on; time allocations in place

The Early Science period is officially underway. Mira came back online after acceptance testing on the evening of Monday 17 December. After an initial glitch in setting up the computer time allocations for the ESP projects, the correct allocations are now in place. These are the target allocations you were awarded when your project was selected for the ESP. On Mira, the command

        cbank-list-allocations -u yourUserName -r mira

will show you the amount and usage of your allocation.

Our one-rack test and development machine, Cetus (cetus.alcf.anl.gov) is now also available to ESP users.

The Early Science period should last through mid-March. When there is concrete information about the exact transition date, I’ll send out an email with the date and information about how the transition to production usage will impact the Early Science projects. You should have used up your ESP project allocations by then.

Allocations for *_esp projects are active

Please note that, since the 24 accepted racks of Mira were turned over to Early Science, time allocations for the ESP projects have been in place. The amounts of the allocations are not yet correct; all were set to a placeholder value of 50 million core hours. When ALCF and Mira are ready, we will establish the formal allocations for the projects. These allocations will be based on the target awards from the letters informing you of your Early Science Program awards. Those are:

PROJECT               AWARD (in millions of Mira core-hours)
--------------------  --------------------------------------
GFDL_esp              150
Mat_Design_esp         50
Autoignition_esp      150
Bulk_Properties_esp   150
DarkUniverse_esp      150
MADNESS_MPQC_esp      150
CFDAnisotropic_esp    150
GroundMotion_esp      150
HSCD_esp              150
TurbNuclComb_esp      150
LatticeQCD_esp        150
TurbChannelFlow_esp    60
AbInitioC12_esp       110
NAMD_esp               80
PlasmaMicroturb_esp    50
MultiscaleMolSim_esp  150

Mira access added; usage announcement mailed out

If you responded to the email asking for new ESP users who need access to Mira for Early Science runs on the accepted 24 racks of the machine, you should now have access to Mira (and Vesta). The possible exception is anyone just now getting an ALCF account for the first time: you have additional application procedures to complete, for which you should’ve received instructions.

Today I sent out an email to mira-early-users with some details about using Mira between now and the start of the 48-rack acceptance testing (around mid November). Those of you in the Early Science Program projects can now run jobs of up to 16 racks in the “ESP” queue. Jobs of 24 racks and higher will land in the “ESP-bigrun” queue, which will be manually managed. This queue should mainly be used for scaling tests, not scientific runs: when you run on 24 racks and higher you’ll be using some of the unaccepted Mira nodes, and you can’t expect the reliability you get on the 24 accepted racks (where all your jobs of 16 racks and less will run).
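The queue rule above can be written down as a small lookup. This is only an illustrative sketch with a hypothetical function name; the scheduler does the routing itself. The post does not say what happens to sizes strictly between 16 and 24 racks, so the sketch returns None there rather than guessing.

```python
def esp_queue(racks):
    """Queue name for a job of the given size in racks, per the post above.

    Returns None for sizes strictly between 16 and 24 racks, which the
    post does not cover.
    """
    if racks <= 0:
        raise ValueError("racks must be positive")
    if racks <= 16:
        return "ESP"          # runs on the 24 accepted racks
    if racks >= 24:
        return "ESP-bigrun"   # manually managed, meant for scaling tests
    return None
```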

Remember that on BG/Q, 1 rack is 1024 nodes (same as BG/P), but is 16K cores (as opposed to 4K cores on BG/P).
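As a quick way to convert rack counts into nodes and cores, the following sketch applies the figures above. The function name is hypothetical, and "16K" and "4K" are read as 16 x 1024 and 4 x 1024, which is an assumption about the rounding in the text.

```python
NODES_PER_RACK = 1024  # same on BG/P and BG/Q

# Cores per rack, reading "16K" and "4K" as multiples of 1024 (assumption).
CORES_PER_RACK = {"BG/Q": 16 * 1024, "BG/P": 4 * 1024}


def rack_totals(racks, system="BG/Q"):
    """Return (nodes, cores) for a given number of racks."""
    nodes = racks * NODES_PER_RACK
    cores = racks * CORES_PER_RACK[system]
    return nodes, cores
```

For example, `rack_totals(16)` returns `(16384, 262144)`, the size of the largest job the “ESP” queue accepts.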

Mira access for general ESP users—send names this week

A reminder to send me information about additional people from your ESP project teams that you’d like to be able to access Mira starting next week.

In order to get on the list I’ll be handing off to User Services, you need to send in the information by the end of this week (tomorrow, Friday 4 Oct). Send to [email protected] .

If you already have Mira access, you don’t need to do anything. The information I need to get someone Mira access is:

  • First and last name
  • ALCF username
  • Email address
  • ESP project (short name, from list below)

PI                      Project Short Name
Venkatramani Balaji     GFDL_esp
Larry Curtiss           Mat_Design_esp
Christos Frouzakis      Autoignition_esp
Mark Gordon             Bulk_Properties_esp
Salman Habib            DarkUniverse_esp
Robert Harrison         MADNESS_MPQC_esp
Kenneth Jansen          CFDAnisotropic_esp
Thomas Jordan           GroundMotion_esp
Alexei Khokhlov         HSCD_esp
Don Lamb                TurbNuclComb_esp
Paul Mackenzie          LatticeQCD_esp
Robert Moser            TurbChannelFlow_esp
Steven C Pieper         AbInitioC12_esp
Benoit Roux             NAMD_esp
William Tang            PlasmaMicroturb_esp
Gregory Voth            MultiscaleMolSim_esp

ESV two-week Mira access now estimated 2nd 2 weeks of August

The current revised estimate for the two-week period when ESP projects will have access to the full Mira for testing codes at scale (the ESV period) is the last half of August, possibly starting around August 20. Please plan for this. (I still haven’t heard from most of the ESP projects who their 1 or 2 ESV users will be. Please email [email protected] with this information.)

[Update 10/3/2012: As indicated in a couple of mass emailings to ESP project participants, the plan changed and there will be no specific 2-week ESV period. We have accepted 24 of the 48 racks, and access will be extended starting the week of Oct. 7 to all ESP project participants to run science on those 24 racks.]

Who do you want to access Mira during the ESV?

So far, I’ve only heard back from 3 projects about the one or two people designated to get access to Mira during the ESV window. Please think it over and send me ([email protected]) the names of those for your project.

Here’s a snip from the email of June 4:

There will be a two-week period (called the Early Science Verification – ESV), prior to the Mira acceptance-test period, when the Early Science codes can be run on the system. Their running correctly is a prerequisite for starting the acceptance testing (assuming the problem is with the system, and not the code or input decks). The ESV is an important milestone for the Early Science Program, the ALCF, and IBM. Please be prepared to exercise your application at scale—ideally running problems comparable with your planned production runs during the Early Science period. This is also a good opportunity for you to get your first multi-rack testing on Mira done.

Our best estimate for the intended start of acceptance testing is mid August, which places the ESV in the first two weeks of August. We expect to have the login/compile servers ready for your access at least one week prior. Please prepare your codes, and plan for members of your teams to run these tests. Only 1 or 2 key people from each project will have access to Mira in this period, so please identify those people in advance.

Current estimate is that ESV will start later—second half of August or after.