JustinPrivitera open issue Alpine-DAV/ascent#1563.
Original vs Actual Extents
Original mesh:…View Comment
JustinPrivitera open issue visit-dav/visit#20449.
VisIt crashes when rendering dataset using transparency
I am trying to visualize my x ray imaging setup, so I am making renders of two files, curv3d.silo and output.root. When I adjust opacity on my rays mesh and then click the viewer, VisIt crashes….View Comment
garlick open issue flux-framework/flux-accounting#671.
allow urgency weight to be reconfigurable
Problem: when the flux accounting priority plugin is in use, the urgency (described in RFC 30) has a very small impact on priority, which means that users cannot meaningfully lower the priority of their jobs (other than hold) relative to others, and sys admins cannot raise the priority of jobs….View Comment
jeremylt closed issue CEED/libCEED#1822.
GPU AtPoints Assembly
Following up from https://github.com/CEED/libCEED/pull/1799…View Comment
BradWhitlock open issue LLNL/conduit#1444.
Partitioner fails when multiple topologies provide adjsets.
I wrote a new test case to test adjset spatial sorting, which makes a mesh and generates lines and corners for it and validates the adjsets. Later, I decided to hook up partitioning. I made the partitioned mesh after the other meshes. It caused the partitioner to crash because topologies (other than the one being partitioned) had adjsets already defined. The logic loops over all adjsets in the domain, which is not right….View Comment
cmoussa1 open issue flux-framework/flux-accounting#666.
documentation: update docs to include information on resource limits
Now that #562 has landed, the docs need to be updated with information on how per-association resource limits are enforced in the priority plugin and how they can be configured both in the database. Perhaps a separate page just on limits could be worthwhile….View Comment
mplegendre closed issue hpc/Spindle#86.
Renames logging #defines
New logging #defines did shell exit and return respectively. Names should be changed to reflect flow control. …View Comment
wihobbs open issue flux-framework/Tutorials#51.
Flux for Sysadmins
At dinner with @trws last night, we discussed the need as Flux grows for a few additional tutorials that go beyond the traditional user training we’ve had in the past: Flux for Sysadmins and Flux for Developers and Hackers….View Comment
btalamini closed issue LLNL/serac#1399.
Make sure mesh is guaranteed to be in a valid state
Even if given a parallel mfem mesh….View Comment
BradWhitlock open issue visit-dav/visit#20420.
Improve Subset wireframe rendering
Improve Subset wireframe rendering to eliminate Z-fighting and improve overall appearance….View Comment
chu11 open issue flux-framework/flux-core#6872.
kvs-watch: support a sentinel value indicating "end of initial values"
It would be convenient if kvs-watch could inform watchers that all data that is presently available (i.e. data may appear in the future but we don’t know) has already been streamed and sent….View Comment
artv3 open issue LLNL/RAJA#1856.
Warnings with GCC 12.1.1
There are a number of warnings that appear with GCC 12.1.1. They look something like this:…View Comment
alecbcs closed issue spack/spack-tutorial#191.
Add note about `spack help --spec` to Basics tutorial
The spec syntax we cover can be confusing for new users, and it’s good to know that we have a command just to describe this syntax. We should end the tutorial with a note about spack help --spec….View Comment
JustinPrivitera closed issue visit-dav/visit#20411.
Update AMR test baselines
Update baselines once #20385 is merged….View Comment
garlick closed issue flux-framework/flux-core#6816.
broker: use IPC when taskmap indicates peer is on the same node
Problem: co-located brokers do not use IPC to communicate unless all brokers are on one node….View Comment
cyrush open issue Viskores/viskores#106.
Strategy for CastAndCall errors
Hi All,…View Comment
jeremylt open issue CEED/libCEED#1830.
CeedVectorFilter
@zatkins-dev would it be adequate to your purposes to add this in libCEED? (Though you could access it via PETSc also)…View Comment
jameshcorbett open issue flux-framework/flux-coral2#364.
Allow rabbit jobs to tolerate loss of some rabbits and compute nodes during prolog
@behlendorf recently found that with large rabbit jobs, he was frequently finding that one or two rabbits would fail to create their file systems, or one or two compute nodes would fail to mount. This was a persistent problem, even with all the recent changes to mark nodes as badrabbit if they fail to mount. Perhaps this is because not many other rabbit jobs had been submitted, which might have helped filter out some of the bad nodes, or perhaps at that scale it simply becomes likely that something will fail; in any case, it would be useful if the rabbit prolog could proceed in spite of some user-specified number of failures. For example, a user might specify that they want 5050 nodes, and can tolerate the loss of 50 of them, or something similar….View Comment
wihobbs closed issue flux-framework/flux-core#6837.
Maybe the frobnicator should add `/tmp/` as `TMPDIR` to user environment when no environment is set
Slightly LC-specific issue, but Spindle breaks when the TMPDIR environment variable isn’t set. Since jobs submitted through the Python API have an empty environment by default, maybe we should consider setting TMPDIR to /tmp/ if the user hasn’t set it already….View Comment
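The proposal above amounts to filling in a default only when the variable is absent. A minimal sketch in plain Python, assuming the job environment is an ordinary dict; `with_default_tmpdir` is a hypothetical helper for illustration, not part of the frobnicator or the Flux Python API:

```python
def with_default_tmpdir(env, default="/tmp"):
    """Return a copy of env with TMPDIR set to default only if it is missing.

    Hypothetical helper: it does not touch any real job submission code.
    """
    new_env = dict(env)                     # leave the caller's mapping untouched
    new_env.setdefault("TMPDIR", default)   # a user-provided value always wins
    return new_env
```

An empty environment gains `TMPDIR=/tmp`, while an environment that already defines `TMPDIR` is returned unchanged.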
cyrush open issue LLNL/conduit#1437.
consider changing CONDUIT_EPSILON to std::numeric_limits
vsoch open issue rootless-containers/usernetes#373.
Infiniband for older kernel
We’ve been able to get Infiniband working with Usernetes, primarily using UCX and then having the devices /dev/infiniband bound from the host. We have a setup of usernetes on-premises (our first on a production cluster and not in VMs alongside) and what I’ve found is the avenue to bind devices and then use ibverbs and ucx works up until the point it needs ulimit -l to be unlimited:…View Comment
mplegendre open issue hpc/Spindle#83.
Handle interception of openat and related calls
Need to investigate whether we can safely intercept and handle openat (and the related *at family of I/O calls). We would need to do so without breaking the atomicity of the calls, which will be a challenge. But we’re also missing I/O redirection when applications use openat….View Comment
markcmiller86 closed issue visit-dav/visit#20403.
Consider adding cmocean color maps
E3SM team would like to have the CMOcean colormaps available in VisIt…View Comment
cyrush closed issue visit-dav/visit#20401.
command line option to control visitlog.py file
### Is your feature request related to a problem?…View Comment
vsoch closed issue flux-framework/flux-python#14.
Flux-Python fails to build on LC systems because of symlinks
During the tutorial dry-run today, we learned that running python3 -m pip install flux-python fails on Tuolumne. I’ve dug into this a bit more and found the issue….View Comment
jedbrown open issue RustNSparks/rocm-rs#7.
Soundness issues
At a glance, I see what look like many soundness issues. For example, Device::reset(&self) cannot guarantee the prerequisites:…View Comment
wihobbs open issue flux-framework/flux-core#6826.
add `__repr__` to JobInfo
```python3…View Comment
cmelone open issue LLNL/hubcast#126.
delay in update of git refs causes race condition in `sync_pr` and `sync_branch`
This was written to ensure there weren’t duplicate calls to send_pack by the sync_branch and sync_pr functions:…View Comment
jeremylt open issue bempp/kifmm#148.
Undocumented Code
This issue is for the JOSS review https://github.com/openjournals/joss-reviews/issues/7124…View Comment
artv3 closed issue LLNL/RAJA#1830.
Fix pragma warnings
Here’s the warning I see on Lassen with cuda-11.3 and gcc-8.3.1:…View Comment
cyrush closed issue Alpine-DAV/ascent#1512.
t_ascent_render_3d test consistently outside tolerance on LC toss4 machines
These two tests fail on rzhound and poodle. They are just outside the allowed image diff tolerance….View Comment
tomstitt open issue Alpine-DAV/ascent#1517.
Unable to build with ROCm 6.4.0
We are seeing an ambiguous namespace error when we build with ROCm 6.4.0 that we don’t see with other compilers (maybe it is just that newer llvms are pickier?):…View Comment
garlick open issue flux-framework/flux-core#6807.
broker: available TBON topology options are not great for resilient batch jobs
Problem: as noted in #6806, performance issues have been observed with large batch jobs run with -Stbon.topo=kary:0 (flat topology). This is currently recommended when resiliency is required, to minimize the number of critical router nodes. Unfortunately, there is no convenient way to just add a few router nodes to an otherwise flat topology to alleviate the bottleneck on rank 0. You have to go all in with kary or binomial….View Comment
JustinPrivitera open issue Alpine-DAV/ascent#1513.
X Ray Filter Enhancements
### High Level Goals…View Comment
hppritcha open issue open-mpi/ompi#13242.
bindings: python check in configury needs to be improved
There’s a check in ompi_configure_options.m4 for python 3.6 or higher to support generation of the top level c and fortran f08 bindings from the code in ompi/mpi/bindings. That’s fine, but the file being used as a sentinel to tell whether or not the bindings have already been generated is not correct. The names of the generated files have changed and ompi_send.c is not being generated, rather a differently named file. …View Comment
emily-howell closed issue Alpine-DAV/ascent#1400.
ascent crash while trying to write image to non existent directory
We think that using: …View Comment
vsoch open issue pytorch/benchmark#2613.
Support for multiple nodes?
Hi! I know that torchrun can handle running across nodes - do you have an example for how to do that here? Thank you!…View Comment
jameshcorbett open issue flux-framework/flux-coral2#356.
rabbit: preset #DW strings are restrictive
Problem: someone attempted to add a preset rabbit #DW string like the following:…View Comment
wihobbs open issue flux-framework/flux-accounting#620.
turn off Python tracebacks by default, optionally enabled with `PYCLI_LOG_LEVEL=*`
I was looking through old ServiceNow tickets in LC and saw a failure:…View Comment
garlick open issue chaos/diod#138.
add tests for Tremove/Trename
@JJL72 said …View Comment
btalamini closed issue LLNL/serac#1352.
Finite element dual assignment not working properly
https://github.com/LLNL/serac/blob/3ae05c23b51dad2b9256966def0cfe535bab1b4b/src/serac/physics/state/finite_element_dual.hpp#L58…View Comment
BradWhitlock open issue LLNL/axom#1565.
Enhance ELVIRA MIR so it can return clipping plane offsets.
The ELVIRA MIR algorithm can return a “normals” field on the output Blueprint mesh that contains the orientation of the normals for the clipped fragment. This feature is activated by passing options["normal"]=1 to the algorithm via the options node. This was originally for debugging/visualization. I have a request to be able to return both the normals and the plane offset….View Comment
vsoch open issue EnzymeAD/Enzyme#2298.
Examples, tests, or FOM?
Hi! I build enzyme with spack, and I’m looking for an example or test that we can run on multiple nodes at increasing sizes to assess performance or benchmark, ideally with time to completion or some other FOM. Can you make a suggestion? Thanks!…View Comment
mplegendre open issue LLNL/Adiak#36.
Discussion: Adding timestamps to adiak metadata
We’ve been asked about adding timestamps of data collection time to adiak metadata. We could do this, perhaps configured by the tool. Alternatively, the tool could subscribe to callbacks and track the timestamps when the callback occurs itself….View Comment
markcmiller86 closed issue visit-dav/visit#20375.
Unreasonably small major spacing for ticks on 2D axes crashes viewer
### Describe the bug…View Comment
artv3 closed issue LLNL/RAJA#1822.
Fix new compiler warnings
Recent changes to the file include/RAJA/pattern/params/forall.hpp generate a bunch of compiler warnings, for example with clang compiler. Please resolve these….View Comment
vsoch closed issue flux-framework/flux-core#6713.
flux job updated duration is not honored
I have a createsims job that I was testing the flux update duration on, and it successfully changed the jobspec, but the job has happily continued running after 25 minutes (1500 seconds). I set it around 16 minutes so I don’t think it’s any kind of race condition. The job started with the default (0). Did I do something wrong?…View Comment
grondo open issue flux-framework/flux-core#6784.
job-exec: signal behavior on timeout and cancel doesn't match docs
Things are still a bit wonky in the job-exec with regard to how and when signals are sent to jobs after a fatal job exception (including timeout)….View Comment
chu11 open issue flux-framework/flux-core#6782.
flux-dump: improve performance w/ asynchronous calls
While working on #6589 (whose code was initially copied from flux-dump), I noticed most of the calls to the content modules are synchronous. In most cases this is probably ok (I’m not sure we’d want to walk multiple directory paths in parallel, but that could be debated)….View Comment
vsoch open issue abinit/abinit#86.
Undefined reference to dgemm
Hiya! I’m trying to build this, and using most dependencies from spack, and I hit this error with make -j 4…View Comment
jandrej open issue mfem/mfem#4824.
PetscParVector doesn't create vector with proper global offsets
```…View Comment
BradWhitlock open issue LLNL/conduit#1429.
generate_corners() does not work for "mixed" unstructured topology.
generate_corners() uses the ShapeType and ShapeCascade classes to help figure out how to decompose the mesh into faces/edges/points. When you give “mixed” to ShapeType, it initializes with -1 for dims and other fields. This crashes later on in ShapeCascade when attempting to use dims as an array index, ultimately leading to a seg fault….View Comment
wihobbs open issue flux-framework/flux-docs#299.
The double copyright
grondo open issue flux-framework/flux-core#6773.
jobtap: jobs get errors from `job.new` with little detail on system instance restart
On Tuolumne, we saw the following errors from a series of jobs on restart. Unfortunately, there’s no detail on which plugin or what exactly went wrong:…View Comment
cyrush open issue Alpine-DAV/ascent#1503.
project_2d + extract keeps extra topology data
use case:…View Comment
JustinPrivitera closed issue visit-dav/visit#20340.
Build Script: Git clone may fail silently
### Describe the bug…View Comment
btalamini closed issue LLNL/serac#1356.
High-order meshes read in from files drop to first order
When Serac reads in meshes from formats that support higher-order elements (such as Genesis), the mesh coordinate field is always linear. Curved edges/faces are lost, and the elements become straight-sided….View Comment
cmelone open issue LLNL/hubcast#121.
Bot functionality
## Problem/Opportunity Statement…View Comment
jeremylt open issue freeCodeCamp/freeCodeCamp#59653.
Perpetuating Gatekeeping Rhetoric
### Describe the Issue…View Comment
alecbcs closed issue LLNL/repligit#5.
Add coverage checks
## Problem/Opportunity Statement…View Comment
adayton1 open issue LLNL/axom#1546.
Add shaping support for grain
My application defines a grain as a centroid and a list of facets (planes?) each defined by 3 points. There’s probably a better name than “grain” for this….View Comment
jedbrown open issue flamegraph-rs/flamegraph#371.
offline post-processing of collapsed stacks without running a workload
I often need to analyze collapsed stacks produced using different tools or on different systems. This can be done today with a slightly awkward command (cf. https://github.com/flamegraph-rs/flamegraph/issues/259#issuecomment-1579859541)…View Comment
alizma open issue METHODS-Group/DRDMannTurb#160.
Benchmarking speed and GPU use as a regression test
The current test suite for the package has a basic test that just checks PyTorch’s observed GPU utilization when an NN is being trained and ensures that the percentage is high. The intention was to automatically detect if new features to the core of the package were introducing CPU-GPU device communication and/or messing with data layouts. But the way this test currently works isn’t all that useful, since only the most basic configuration is checked while many use cases where incorrect data loading might reduce performance are not covered; moreover, GPU utilization is a misleading metric. Basing a test on MFU, as discussed in that post, would not only be more useful for benchmarking the code’s speed and resource use, but also serve as a better regression test. …View Comment
jameshcorbett closed issue flux-framework/flux-core#6751.
`post-job-event` crashed `job-list` module on hetchy
On Hetchy, I ran sudo -u flux flux post-job-event fwc7axjckQ7 exception type=dws-epilog-timeout severity=0…View Comment
cyrush open issue Alpine-DAV/ascent#1498.
C++17
Add CI tests that leverage C++17 for all tpls…View Comment
hppritcha closed issue open-mpi/ompi#13171.
HAN leaks memory
There are various cases where HAN leaks memory. At my site, users are complaining about memory leaks with MPI window creation, but the underlying problem has to do with releasing resources retained by other components, which is not being done correctly….View Comment
chu11 open issue flux-framework/flux-core#6745.
t: audit and remove && chained background processes
In a number of locations this pattern exists…View Comment
grondo open issue flux-framework/flux-core#6737.
flux-proxy: improve error message when `TMPDIR` does not exist (or just fall back to `/tmp`?)
The error from flux proxy when TMPDIR doesn’t exist could be more helpful:…View Comment
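The two options in the issue title (a clearer message, or a fallback to `/tmp`) can be sketched together. This is an illustrative Python sketch only: `resolve_tmpdir` is a hypothetical name and does not reflect how flux-proxy is actually implemented.

```python
import os


def resolve_tmpdir(environ, fallback="/tmp"):
    """Pick a usable temporary directory and explain any fallback.

    Returns (directory, warning); warning is None when no fallback
    message is needed. Hypothetical helper, not flux-proxy code.
    """
    requested = environ.get("TMPDIR")
    if not requested:
        return fallback, None            # TMPDIR unset or empty: use the default
    if os.path.isdir(requested):
        return requested, None           # TMPDIR exists: honor it
    # TMPDIR points at a missing directory: say so explicitly
    warning = f"TMPDIR={requested!r} does not exist; falling back to {fallback}"
    return fallback, warning
```

Returning the warning instead of printing it leaves the caller free to either report it and continue or turn it into a fatal error.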
tgamblin closed issue spack/spack#49717.
Spack failing with unhelpful message
### Steps to reproduce…View Comment
JustinPrivitera open issue LLNL/conduit#1405.
polyhedral mixed transform inside of Blueprint reader in VisIt should live in Conduit
jeremylt closed issue CEED/libCEED#1732.
Run-time API for compile-time provenance
xSDK policy specifies…View Comment
bwpriest open issue LLNL/ygm#295.
Difference in signature for `async_visit` and `local_visit` for YGM containers
I’ve noticed that local_visit passes the lambda function by reference, whereas async_visit passes it by value for YGM maps and arrays, e.g. local reference and async value. This means that it is necessary to use the syntax:…View Comment
benzwick open issue KitwareMedical/SlicerCustomAppTemplate#96.
Cannot hide the module panel dock widget on start up
I am trying to develop a Slicer Custom App without a module panel….View Comment
JustinPrivitera open issue LLNL/conduit#1397.
I think the strided structured example mesh is wrong
The strides for element associated fields should be in terms of elements. This example makes it seem as though strides are always the same for topologies, vertex-associated fields, and element-associated fields. I think we should change the example to make it clear….View Comment
alecbcs closed issue spack/spack#48024.
`Package.extendee_spec` does not respect conditional `extends` when extended package is transitively dependent
### Steps to reproduce…View Comment
jeremylt closed issue jeremylt/LFAToolkit.jl#66.
Documentation Bug
We should revert disabling the strict checks when running makedocs when https://github.com/JuliaDocs/Documenter.jl/issues/1883 is fixed. Until then we should manually double check in PRs that docstrings are added and included in the manual when new methods are created….View Comment
alecbcs closed issue LLNL/repligit#1.
Switch to the Ruff Formatter / Linter
## Problem/Opportunity Statement…View Comment
JustinPrivitera open issue visit-dav/visit#20319.
Add `--bonus` list of tpls
We have --required, --optional. But together they do not have all of the tpls. I propose we add another list --bonus that includes everything else….View Comment
emily-howell closed issue Alpine-DAV/ascent#1413.
camera zoom of 0 causes hang
Accidentally had camera zoom set to 0 (which doesn’t make much sense) and noticed that my render hung. …View Comment
trws open issue flux-framework/rfc#452.
Rv2 - generic resource tree + addons discussion issue
We’ve had a few discussions about this, and after noodling on it for a while I thought it would help to get us talking to put together something we can poke at. Here’s the example from the Rv1 RFC, with the hostname changed to disambiguate for later:…View Comment
cyrush open issue LLNL/conduit#1395.
improve performance of to_unstructured
move to use accessors….View Comment
IanLee1521 open issue elastic/kibana#214690.
[Uptime] Allow sorting columns in Uptime UI
Describe the feature:…View Comment
tgamblin closed issue spack/spack#49403.
spack.yaml: use of include causes AssertionError: Included scope ... never added to configuration!
### Steps to reproduce…View Comment
grondo open issue flux-framework/flux-core#6712.
SIGKILL isn't sent to stuck job shells on elcap
During a full system DAT on elcap we noticed that jobs within a batch job were stuck in CLEANUP for 15m….View Comment
jameshcorbett open issue flux-framework/flux-core#6710.
Support for non-v1 jobspecs
Problem: flux-coral2 creates non-v1 jobspecs that have resource sections like:…View Comment
adayton1 open issue LLNL/axom#1521.
Inlet feature requests
I believe my application will need the following things from Inlet, and they should be generally useful for other applications as well:…View Comment
markcmiller86 open issue visit-dav/visit#20302.
VisIt CLI methods should tag themselves somehow with a special attribute
When writing general python logic that scans various objects in the current name space (our help does this), it would be nice to be able to easily tell the difference between object VisIt defines either via C extension modules or even pure python modules and other python objects either from python proper or from other python modules….View Comment
chu11 open issue flux-framework/flux-core#6709.
job-info: cancellation lists do not scale well
Very similar to #6995, the loops in watchers_cancel() and guest_watchers_cancel(), which iterate through all current watchers, can be slow and scale poorly….View Comment
garlick open issue flux-framework/flux-coral2#323.
[PE_0]: inet_listen_socket_setup:bind() failed [fd=3, port=1371 err='Address already in use']
Problem: PMI port allocation is likely to result in occasional failures…View Comment
balos1 closed issue LLNL/sundials#667.
idaklu solver can't be found
When compiling and installing SUNDIALS on armv8
cmoussa1 open issue flux-framework/flux-accounting#593.
`update-usage`: past usage factors not actually getting applied, not going all the way back to last usage period
While doing some random experimentation on fluke, I noticed that the past usage period job usage factors for associations were not actually getting updated. Only the most current period was being calculated. I think this is because the apply_decay_factor() function was not actually committing any of the UPDATE statements it was executing. I need to add a .commit() to the end of the UPDATEs….View Comment
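The failure mode described above is easy to reproduce with Python's standard `sqlite3` module: an UPDATE runs inside an implicit transaction, so other connections keep seeing the old value until `commit()` is called. The sketch below uses a made-up one-column `usage` table and a simplified `apply_decay` function; neither is the flux-accounting schema or its `apply_decay_factor()` implementation.

```python
import os
import sqlite3
import tempfile


def read_usage(path):
    """Open a fresh connection and return the stored usage_factor."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT usage_factor FROM usage").fetchall()[0][0]
    finally:
        conn.close()


def apply_decay(conn, decay, commit):
    """Scale the stored factor; the change persists only after a commit."""
    conn.execute("UPDATE usage SET usage_factor = usage_factor * ?", (decay,))
    if commit:
        conn.commit()


def demo():
    """Return the value other connections see before and after the commit."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "usage.db")
        writer = sqlite3.connect(path)
        writer.execute("CREATE TABLE usage (usage_factor REAL)")
        writer.execute("INSERT INTO usage VALUES (8.0)")
        writer.commit()

        apply_decay(writer, 0.5, commit=False)   # UPDATE issued, never committed
        seen_without_commit = read_usage(path)   # another connection reads the old value

        writer.commit()                          # the missing step
        seen_after_commit = read_usage(path)
        writer.close()
        return seen_without_commit, seen_after_commit
```

If the writing connection were closed without the commit, the pending UPDATE would be rolled back, which matches the symptom of only the most recent period ever being updated.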
cyrush open issue visit-dav/visit#20284.
movie wizard fails when selecting unsupported output type
### Describe the bug…View Comment
trws open issue flux-framework/flux-sched#1358.
remove use of `nodes_up` in favor of using the aggregate counts
This is, as far as I know, only used in the request_feasibility check I added not too long ago. It managed to get out of sync due to a bug in update that @milroy recently found, and should be removed so we can have fewer caches to worry about. My current thought is we should make it a method and in there use the aggregate filter to get the count, that way if we pivot on it later we can avoid having to track it down again….View Comment
grondo open issue flux-framework/flux-core#6692.
need a convenient way to determine the reason a job was unsuccessful
In a resilient batch job, a user may want to run a job in a loop until it is no longer feasible, or they may want to retry jobs that were unsuccessful due to node failure. Currently, there isn’t a great way to do either of these….View Comment
jameshcorbett open issue flux-framework/flux-coral2#311.
`dws2jgf` does not apply properties to all nodes
Problem: when flux-dws2jgf writes out nodes, it neglects to output properties for all nodes that are not in DWS. For instance, on Tioga, tioga[42-43] are not associated with a rabbit, but they are listed in resource.toml and are assigned to queue mi300a. But flux dws2jgf wrote them out without assigning any properties whatsoever….View Comment
chu11 open issue chaos/gendersllnl#14.
compat: retire ancient hostlist lib
copying…View Comment
chu11 closed issue chaos/genders#92.
genders.spec: does not handle python3 alone if building w/ debug packages
If building rpms w/ the spec file and only python3 is being built (vs python2 & python3) there is a logic error in the spec file IF debug packages are being built. The error will be:…View Comment
wihobbs open issue codecov/feedback#665.
Bug Report: a 90-day lifetime of comments on deleted branches or closed PRs seems to cause a race condition
Describe the bug…View Comment
BradWhitlock open issue visit-dav/visit#20273.
Blueprint reader can mess up fields for point meshes.
### Describe the bug…View Comment
JustinPrivitera closed issue LLNL/conduit#1377.
Help about defining the connectivity
I am trying to construct a mesh using conduit. I checked the documentation (https://llnl-conduit.readthedocs.io/en/latest/blueprint_mesh.html#examples) and the examples but I could not figure out how connectivity needs to be defined. In this case I could have either tri or quad type elements in my mesh. So, I am trying to construct like following,…View Comment
jedbrown open issue coreylowman/cudarc#340.
soundness with async copies
The non-sync copy interfaces do not hold onto the host memory, so subsequent modification of the host memory can race with the DMA transfer to device. This is an example with htod_copy_pinned #336. Small sizes will spuriously pass this test due to what looks like eager copying, but N=10000 here fails every time in my tests (cuda-12.8, sm_89)….View Comment
JustinPrivitera open issue visit-dav/visit#20268.
Sort by date in open file dialog
https://teams.microsoft.com/l/message/19:f2ed7be3682d40d1b8e038744e500a09@thread.skype/1740678263001?tenantId=a722dec9-ae4e-4ae3-9d75-fd66e2680a63&groupId=70162982-9587-4bcc-ad53-20178c76fe11&parentMessageId=1740678263001&teamName=VisIt&channelName=General&createdTime=1740678263001…View Comment
vsoch open issue converged-computing/state-machine-operator#6.
A few TODO for state machine operator
These are from my personal notes - not high priority so putting them here….View Comment
chu11 closed issue chaos/genders#70.
genders: slow on elcap
The admins have noticed that nodeattr (and by consequence libgenders) is a bit slow on elcap. The size and the buildup of data structures within libgenders is probably just eating up too much time. As an example with this genders db:…View Comment
tgamblin closed issue spack/spack#44289.
`python+tkinter` cannot be concretized with --fresh
### Steps to reproduce…View Comment
v-dobrev closed issue mfem/mfem#4683.
AddMultTransposeMF inverted in SumIntegrator
Hi MFEM developers,…View Comment
grondo open issue flux-framework/flux-core#6658.
missing `flux-restart` event in job eventlog after restart
There was an interesting difference in a couple job eventlogs on elcap after a Flux restart. Both of these jobs should have contained a flux-restart event at around the same time:…View Comment
JustinPrivitera open issue visit-dav/visit#20250.
`build_visit` invocation fails with mpich enabled on LC
I figured this would work now that #19600 had been addressed. mpich builds but we fail to build VisIt itself….View Comment
jeremylt open issue freeCodeCamp/freeCodeCamp#58931.
Book Organizer Misleading Instructions
### Describe the Issue…View Comment
chu11 open issue chaos/genders#79.
Consider new value returned from genders_getmaxnodelen()
The value returned from genders_getmaxnodelen() requires us to call strlen() on all hosts and iterate over all host ranges….View Comment
bhatele open issue pssg-int/pssg-plots#4.
line plot
- [ ] Should have three lines in the example plot…View Comment
cjekel open issue cjekel/piecewise_linear_fit_py#125.
update to new numpy random seed generator normals
It looks like numpy random has undergone a large overhaul. See https://numpy.org/doc/2.2/reference/random/generated/numpy.random.seed.html…View Comment
cyrush open issue LLNL/zfp#253.
generic serialization api needs to be reworked
To serialize / deserialize, the docs show we need to declare a zfp::array::header instance, however it is now an abstract class:…View Comment
markcmiller86 open issue LLNL/Silo#436.
Merge 424 and 425 to 4.11RC
samuelkgutierrez closed issue hpc/quo-vadis#286.
Explicitly mention numbering type in output and/or switch to logical numbering
In current tests, the internal function qvi_test_emit_task_bind uses …View Comment
garlick open issue flux-framework/flux-core#6645.
malleability of flux instance: implement shrink
Problem: Flux instances are not malleable; that is their size cannot be adjusted at runtime….View Comment
jameshcorbett open issue flux-framework/flux-coral2#278.
docs: add `getrabbit` documentation
Problem: there is no documentation for the flux getrabbit command. If any users hear about it, it will only be through word-of-mouth….View Comment
IanLee1521 open issue elastic/integrations#12761.
[Elasticsearch]: New index_pivot transform isn't starting
### Integration Name…View Comment
jeremylt open issue CEED/libCEED#1753.
AtPoints GPU Transpose vs TransposeAdd
“Ratel is too slow” as I say -> we do TransposeAdd but should have a Transpose option too…View Comment
cmoussa1 closed issue flux-framework/flux-accounting#523.
documentation: add `flux-account(1)` man page
flux-accounting doesn’t have a manual page for all of its various commands, and since it has so many, there should be some more extensive documentation for these commands other than just the standard flux account --help….View Comment
IanLee1521 open issue elastic/kibana#210124.
[Observability][Infrastructure][Hosts] Normalize Network utilization graphs to per-second
Kibana version: 8.16.1…View Comment
markcmiller86 open issue LLNL/Silo#435.
Documented unallowed characters in various contexts
- Semicolon in string arrays (coordnames, matnames, species names, etc.)…View Comment
adayton1 closed issue LLNL/CHAI#294.
The GPU thin mode does not respect allocators
In the single memory space paradigm, there is no pointer record, so the allocators passed to the constructor of ManagedArray are thrown away. Looks like a pointer record needs to be generated in this case….View Comment
jeremylt closed issue CEED/libCEED#1737.
De-duplicate Gen Basis Matrices
Follow-up on #1735 …View Comment
cyrush closed issue Alpine-DAV/ascent#1431.
add actions that control logging options
Actions that control logging to give users more control than just params passed via ascent::open…View Comment
btalamini closed issue LLNL/serac#979.
Single `.hpp` for inclusion in subproject
User codes would like to include a single header for simplicity….View Comment
jameshcorbett open issue flux-framework/flux-core#6596.
housekeeping: misplaced partial release
@milroy and I were seeing some partial-cancel errors from Fluxion even though we thought we had disabled partial release. I was taking this (mistakenly?) as an indication that partial release was disabled:…View Comment
cyrush open issue Alpine-DAV/ascent#1452.
add domain_id field to project_2d (scalar rendering)
for diagnostics…View Comment
jeremylt open issue freeCodeCamp/freeCodeCamp#58455.
Palindrome Project Confusion
### Describe the Issue…View Comment
cjekel open issue cjekel/piecewise_linear_fit_py#123.
remove pyDOE for LHS
See discussions https://github.com/cjekel/piecewise_linear_fit_py/issues/120…View Comment
garlick open issue flux-framework/flux-sched#1329.
man: firstnodex is undocumented
Problem: just happened to notice that the firstnodex match policy is missing from flux-config-sched-fluxion-resource(5)….View Comment
garlick closed issue flux-framework/flux-coral2#222.
hang in MPI_Init with unbalanced ranks
as described in https://rzlc.llnl.gov/jira/browse/ELCAP-705: …View Comment
cyrush closed issue LLNL/conduit#1357.
Building conduit with adios support is failing under Spack
I tried to compile conduit (0.9.2) with adios support using the spack package manager and it gives a build error like the following,…View Comment
vsoch closed issue hpc-social/jobs#1.
Sorting problem on dates
Clicking the sort arrow for date posted resulted in this odd ordering. I looked over the code but didn’t spot the problem right away. Looks like it’s an alphabetic rather than a date-order sort issue. …View Comment
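The difference between an alphabetic and a date-order sort can be sketched as follows. This is a minimal illustration, not the site's actual code: the sample values, the M/D/YYYY format, and the helper name `parse_posted` are all assumptions.

```python
from datetime import datetime

# Hypothetical "date posted" values; the M/D/YYYY format is an assumption.
posted = ["10/2/2023", "2/15/2024", "9/30/2023"]


def parse_posted(text):
    """Turn a date string into a datetime so it compares chronologically."""
    return datetime.strptime(text, "%m/%d/%Y")


# Sorting the raw strings compares character by character, so "10/..."
# sorts before "2/..." and "9/...".
alphabetic = sorted(posted)

# Sorting with a parsed key orders the same strings by calendar date.
by_date = sorted(posted, key=parse_posted)

print(alphabetic)
print(by_date)
```

Keeping the display strings and sorting on a parsed key leaves the rendered column unchanged while fixing the order.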
tgamblin closed issue hpsfoundation/tac#24.
Update `lifecycle_policy.md` to state governance document requirement for non-Core projects
In today’s TAC meeting, we learned that all LF projects must have a governance document. I was worried I’d missed something in the lifecycle_policy.md document, but re-reading it, I think that requirement is not stated. Specifically, it seems to say: …View Comment
hppritcha closed issue open-mpi/prrte#15.
Add prefixing capability to prrte library that will be used by Open MPI's mpirun
A next step after issue #10 is addressed is to enable similar prefixing of the libprrte.so binary. This should be able to leverage much of the infrastructure implemented in the solution to #10 but would be better done as a subsequent step.…View Comment
hppritcha closed issue open-mpi/ompi#12712.
refactor/rewrite mpirun to not exec a second executable
As part of the Open MPI fork of PRRTe migration into the Open MPI source tree proper, it would be desirable to rework Open MPI’s mpirun tool to not exec the prterun executable.…View Comment
cmoussa1 closed issue flux-framework/flux-accounting#530.
Add reporting for historical cpu utilization
I am looking for some reporting command similar to Slurm’s sreport. From the man page, “sreport is used to generate reports of job usage and cluster utilization for Slurm jobs saved to the Slurm Database, slurmdbd”. The sreport command allows the user to specify different report types with options. …View Comment
jeremylt closed issue CEED/libCEED#1731.
CI: switch to native arm64
We’ve been using run-on-arch-action (which runs virtualized under QEMU) to test architectures for which GitHub does not provide native hardware (namely, aarch64, ppc64le). We can now switch to native Linux arm64. https://github.blog/changelog/2025-01-16-linux-arm64-hosted-runners-now-available-for-free-in-public-repositories-public-preview/…View Comment
tzanio closed issue mfem/mfem#4654.
miniapps are not being built ?
According to the web page …View Comment
hppritcha open issue mpi-forum/mpi-issues#871.
dynamic process management: table 11.1 associated with section on 11.4.1 tries to escape the chapter
We need a LaTeX guru to fix the location of table 11.1. It used to be close to section 11.4.1, which it’s associated with - currently on page 508. Now the table is way down on page 548 at the very end of the chapter! I am building the mpi-4.x branch at sha ef6c0e10…View Comment
jeremylt open issue freeCodeCamp/freeCodeCamp#58185.
Typo in Sentence Analyzer
### Describe the Issue…View Comment
JustinPrivitera closed issue LLNL/conduit#1242.
add species I/O support for Silo meshes
tzanio open issue mfem/mfem#4584.
Adding a constraint which is integral of the gridfunction over a subdomain
I want to compute the AC current density distribution in a wire passing 1 ampere (electromagnetic skin effect). To do so I used the equation of magnetic vector potential in 2D (assume the wire is long and straight). The proper equation is solved in example 22 prob 1, -Div(a Grad u) - omega^2 b u + i omega c u = 0. In example 22 there are boundary conditions but in my case I would like to enforce “integral(-i omega sigma u) = 1” so that the current is 1A and get rid of the Dirichlet conditions. …View Comment
garlick open issue flux-framework/flux-core#6549.
idea: synthetic file system mounted on /flux using 9P
Old idea actually, but having recently revisited diod to do some maintenance, I’m reminded again of how simple it is to embed a 9P server in other code. We could embed one in the flux broker (as a broker module) to get ourselves a synthetic file system mounted on /flux for each Flux instance.…View Comment
cjekel closed issue cjekel/piecewise_linear_fit_py#120.
publish wheels
I notice that this package publishes only source distributions, so that every user has to build the wheel themselves. Better for package owners to build and publish wheels once and for all. …View Comment
JustinPrivitera open issue LLNL/conduit#1349.
Specset flavor converters
We need specset flavor converters much like we have for matsets….View Comment
adayton1 open issue LLNL/CHAI#294.
The GPU thin mode does not respect allocators
In the single memory space paradigm, there is no pointer record, so the allocators passed to the constructor of ManagedArray are thrown away. Looks like a pointer record needs to be generated in this case….View Comment
tgamblin closed issue hpsfoundation/tac#17.
[Project Proposal] Viskores
# Project Proposal …View Comment
samuelkgutierrez closed issue hpc/quo-vadis#284.
Rename Test Utilities
Avoid the use of qvi_test_* for test utilities….View Comment
correaa open issue mull-project/mull#1082.
Installation instructions do not work across the board
When I use aarch64 the installation instructions do not work. I assume it is because there is no binary for this architecture. If so, it would be useful to document that in the instructions.…View Comment
balos1 closed issue LLNL/sundials#634.
Build issue on Windows with SUNDIALS v7.1.1
I try to compile and link cvRoberts_dns.c from https://github.com/LLNL/sundials/blob/main/examples/cvode/serial/cvRoberts_dns.c….View Comment
jeremylt open issue CEED/libCEED#1720.
ROCm CI
We currently don’t have a card on Noether for ROCm CI. We should re-enable ROCm CI when able….View Comment
tzanio closed issue mfem/mfem#4584.
Adding a constraint which is integral of the gridfunction over a subdomain
I want to compute the AC current density distribution in a wire passing 1 ampere (electromagnetic skin effect). To do so I used the equation of magnetic vector potential in 2D (assume the wire is long and straight). The proper equation is solved in example 22 prob 1, -Div(a Grad u) - omega^2 b u + i omega c u = 0. In example 22 there are boundary conditions but in my case I would like to enforce “integral(-i omega sigma u) = 1” so that the current is 1A and get rid of the Dirichlet conditions. …View Comment
alecbcs closed issue spack/spack-gantry#112.
Add centralized config
## Problem/Opportunity Statement …View Comment
BradWhitlock open issue LLNL/axom#1481.
axom::primal::clip() for polygons is unreliable.
I was trying to clip 2 polygons (a quad and a triangle) to determine their overlap and the clip() routine sometimes makes invalid polygons, which can lead to wrong results or a crash….View Comment
balos1 closed issue sundials-codes/sundials-download-tracker#1.
2024 Partial Data Loss
For documentation purposes, I am writing this issue to note that clone data was not captured between 2024-09-12 and 2024-12-18. As such, the 2024 total count of clones is an underestimate.…View Comment
mej open issue mej/nhc#155.
Doc & Example for External Match Syntax
NHC’s support for Match Strings is a powerful, extensible mechanism for determining exactly which checks will run on which hosts and under what conditions. The most obvious and prevalent way they are used is in the config file, specifying criteria for NHC to use in deciding which checks it should run. While the 3 built-in mstr types (glob, regex, range) are usually sufficient for making that decision based solely on hostname/nodename, there are use cases for using other criteria, like nodegroups, Slurm partitions or states, etc., a few of which exist at Los Alamos. …View Comment
garlick open issue flux-framework/flux-core#6529.
idea: follower brokers actively track lead broker configuration
Problem: currently we assume the same TOML configuration files are synced across the cluster, but in practice sometimes this is inconvenient. …View Comment
bwpriest open issue LLNL/saltatlas#84.
dhnsw::detail::query_engine is incompatible with recent changes to cereal::YGMInputArchive
Saltatlas DHNSW executables can no longer compile due to the following error:…View Comment
IanLee1521 open issue elastic/kibana#204685.
[Discover] Open new case from Discover
vsoch open issue danbarr/hcp-packer-azure#1.
Build without HPC registry?
Hi! I’d like to test this out, but I don’t want to have to bring in another service, aside from using Azure. Do you have an example for that? Thanks!…View Comment
cyrush open issue Alpine-DAV/ascent#1431.
add actions that control logging options
Actions that control logging to give users more control than just params passed via ascent::open…View Comment
markcmiller86 open issue RenderKit/ospray#604.
Please publish shasum -a 256 checksums for your downloadable assets both here and on ospray.org
Would it be possible for you to please publish checksums for your downloadable assets and do so in multiple places (any emails announcing releases, as part of the GitHub release assets and on ospray.org web pages). …View Comment
BradWhitlock open issue visit-dav/visit#20114.
Pseudocolor plot of mixed-material variable stored in Blueprint is SLOW.
### Describe the bug …View Comment
cyrush open issue visit-dav/visit#20110.
request for center of gravity query
BradWhitlock open issue LLNL/conduit#1336.
Improve sphinx documentation for fields.
The field documentation is not adequate for mixed-material fields….View Comment
JustinPrivitera closed issue visit-dav/visit#18736.
Enhance Displace operator to allow specification of reference state.
Enhance the Displace operator to allow specification of the reference state. …View Comment
jameshcorbett open issue flux-framework/flux-coral2#243.
Preset DW strings for copy_in and copy_out
Preset DW strings can now be set in a config file, but the original use case was for #DW jobdw strings. I was thinking that it might be nice to do preset DW strings that do copy in and out.…View Comment
jeremylt closed issue CEED/libCEED#1700.
Get* Interface Uniformity
Ok, we’re pretty inconsistent about when a Get interface needs a Restore/Destroy. Here’s the list of ones that don’t.…View Comment
balos1 closed issue LLNL/sundials#594.
Question regarding choice of Newton solver in IDA
IDA provides a default Newton method (Modified Newton with no globalization) for solving the nonlinear algebraic system at every timestep. However, IDA also implements a Newton+Linesearch algorithm which is (as far as I can tell) only used from within IDACalcIC, and cannot be selected as the nonlinear solver for the per-timestep systems. …View Comment
cmoussa1 open issue flux-framework/flux-accounting#543.
config: is there a need for flux-accounting to be consistently updated with general flux config?
We had a pretty productive conversation yesterday in a coffee hour talking about a potential need for flux-accounting to maintain consistency with configuration options set in Flux config files, particularly for queues. While the use case for flux-accounting is specific (mostly just to limit access to queues on a per-user basis), it still requires management of which queues are set up on a cluster, and this management potentially comes from two different places (a configuration file for fluxion, and a table in a SQLite database for flux-accounting). …View Comment
v-dobrev closed issue xsdk-project/xsdk-issues#267.
mfem+hip~sundials defaults to c++11 - with build errors (about c++14).
``` …View Comment
cyrush open issue visit-dav/visit#20075.
VTK 9.4 was tagged 2024/11/22
### Is your feature request related to a problem? …View Comment
cmoussa1 open issue flux-framework/flux-accounting#540.
plugin: read priority configuration information from TOML files
The priority plugin already reads basic configuration properties for general priority factor weights, but not priority attributes for queues specifically, e.g., their associated priorities. The plugin should be expanded to support reading attribute properties about queues (such as priorities) from a config file….View Comment
cmoussa1 open issue LLNL/cmvl#2.
repo: ok to add Flux job logging script here?
Is this repo a good spot for me to add my custom scripts for generating the Flux .ndjson files? Let me know. Perhaps I can create a flux folder where I can stash the Python scripts for querying inactive jobs and generating the logs….View Comment
cyrush closed issue Alpine-DAV/ascent#1391.
The built-in cinema viewer page: cinema_databases/cinema_out/index.html appears disorganized and unresponsive
I successfully generated a cinema database following the WarpX Ascent In-Situ Tutorial, using WarpX version 24.09, instrumented with Ascent version 0.9.3, built using Spack. I modified the Spack recipe for AMReX to include Ascent and Conduit as variants. …View Comment
JustinPrivitera closed issue visit-dav/visit#19873.
Builds and host profiles for El Cap Early Access Systems
Presumably this is important for 3.4.2…View Comment
correaa open issue fmtlib/fmt#4240.
Extend the interface of join to pass nested separator down
With fmt, one can pass a range and a separator to fmt::join.…View Comment
vsoch open issue converged-computing/lammps-time#3.
Weekend (early) experiment plans
I’m interested to see how this looks across a more reasonably sized run, which doesn’t need to be “HPC level” but I want it to last a few minutes. I want to see the degree to which patterns vary across a set of nodes (all running LAMMPS, for example) and I’m also interested in the following ideas / use cases: …View Comment
vsoch open issue converged-computing/performance-study#73.
OSU Issue
- [ ] Azure AKS missing 128 32 and 256 in plots - we have the data so must be the global function to filter is filtering it out. …View Comment
alecbcs closed issue spack/spack-tutorial#321.
documentation console script are not readable
The console log of the documentation does not provide the good information. Instead there is the path of the needed files. …View Comment
adayton1 open issue sandialabs/verdict#4.
GPU support
Are there any plans for adding GPU support to verdict? My GPU accelerated code is hitting a big slowdown when I have to switch to host only execution for some verdict evaluations. It looks to me as if porting verdict would mostly involve adding __host__ __device__ specifiers to functions and switching from std:: math functions to the corresponding C versions….View Comment
tomstitt closed issue LLNL/Caliper#622.
Hardcoded C++ Standard
Hi, I noticed that Caliper is hardcoded to build with c++11: …View Comment
cyrush open issue visit-dav/visit#20036.
3.4.2 release
Tracking Issue for all 3.4.2 release builds done by LLNL team. …View Comment
jeremylt closed issue CEED/libCEED#1704.
Rust from_raw should be unsafe
Need to make Operator::from_raw() and similar unsafe!…View Comment
trws closed issue flux-framework/flux-sched#1309.
partial cancel: transition partial cancel final error to warning
hppritcha open issue open-mpi/ompi#12909.
UCX not happy with sessions process model
Some of the sessions tests in the ompi-tests/ibm test suite fail when using the UCX PML. …View Comment
markcmiller86 open issue visit-dav/visit#20006.
Add fuzz and scanning logic for checking failed Azure CI tests
We should add args to the test suite run in Azure which permit some diffs in the outputs of tests but still be considered a pass. Once we have that working, we should add pass/fail status to test suite run for Azure CI. …View Comment
correaa open issue fmtlib/fmt#4226.
Extend tuple features (e.g. join) to tuple-like objects
For some technical reasons I had to replicate the functionality of std::tuple in my library. (see here: https://github.com/correaa/boost-multi?tab=readme-ov-file#formatting-fmt-pretty-printing) …View Comment
vsoch open issue compose-spec/compose-go#707.
Example to read config file from string?
This is likely a more rare use case, but I want to use go embed to include a string-ified compose file (that loads into a Project) and I’m wondering if you have an example to do that? I’m looking here: https://github.com/compose-spec/compose-go/blob/b9d3b1deec9bc25de0fa81867ca4df75aef5c791/cli/options.go and I see that ReadConfigFiles can accept a path or stdin, but not a loaded string. I can try to dig deeper and see if I can come up with something, or fall back to reading from file for now, but I’m hoping someone has come up with something. Thanks!…View Comment
JustinPrivitera closed issue visit-dav/visit#19876.
Blueprint MFEM New LOR breaks nodal vs zonal variables
Old LOR of bp_mfem_ex9_periodic_hexagon_000000.root solution variable: …View Comment
tgamblin closed issue spack/spack#47347.
Propagation doesn't stand a pickle round trip
### Steps to reproduce…View Comment
garlick open issue flux-framework/flux-core#6409.
job-manager: don't use flux imp kill to signal housekeeping, perilog
Problem: flux imp kill will be deprecated but the job manager uses it to terminate housekeeping and prolog/epilog tasks.…View Comment
cyrush open issue LLNL/conduit#1331.
auto detect files with `yml` extensions as a `yaml`
adayton1 open issue LLNL/Umpire#921.
Extract code for merging memory chunks into separate function
In QuickPool::deallocate (and the pending ResourceAwarePool::do_deallocate method), there is code to merge memory chunks that is repeated. Consider extracting into a separate function….View Comment
BradWhitlock closed issue LLNL/axom#1457.
CLI11 warnings on BlueOS+CUDA.
These warnings were reported by the Ares team and they would like them to be corrected. It looks like they are building on BlueOS with CUDA enabled. …View Comment
markcmiller86 closed issue visit-dav/visit#19501.
Silo support for PeriodicDomList and PeriodicBndList
### Describe the bug …View Comment
markcmiller86 open issue LLNL/Silo#426.
Investigate performance of Silo's browser reading large DBmultimeshadj objects
I have some data on RZ from a large, multiblock mesh. The DBmultimeshadj objects in the file are large. There is a very long delay when printing them in browser. Is the problem in browser or in the silo library?…View Comment
BradWhitlock open issue LLNL/axom#1457.
CLI11 warnings on BlueOS+CUDA.
These warnings were reported by the Ares team and they would like them to be corrected. It looks like they are building on BlueOS with CUDA enabled. …View Comment
tomstitt open issue LLNL/RAJA#1756.
Runtime memory policy for multireducers
Is your feature request related to a problem? Please describe. …View Comment
vsoch open issue converged-computing/ensemble-python#10.
idea: ensemble members separate
Right now, we are submitting isolated ensembles, meaning single yaml “ensemble.yaml” files. …View Comment
JustinPrivitera open issue visit-dav/visit#19920.
Onion Peel Domain Selection should be a typeable field instead of a drop down menu
A user on teams suggested our onion peel operator ought to let you type the domain number instead of using a drop down list to choose domain. …View Comment
markcmiller86 open issue visit-dav/visit#19917.
Misleading save of windowLayout in GlobalAttributes section of config
### Describe the bug …View Comment
jedbrown closed issue rsmpi/rsmpi#189.
Derive implementation of `Equivalence` hard codes `mpi` path
The Equivalence derive macro expects mpi to be an explicit dependency of the crate using the derive macro by hard coding the mpi dependency in the paths of the types used in the proc macro, e.g. ::mpi::some::field. An error is returned if mpi is provided some other way, e.g. as a re-export in a dependency.…View Comment
grondo open issue flux-framework/flux-security#185.
imp: implement signal forwarding specification from RFC 15
The RFC 15 signal handling section now specifies: …View Comment
jeremylt closed issue Eudicods/outworlds-wastes#125.
Update CONTRIBUTING.md file
The file regarding contributing to the repository is not up to date. It lacks guidelines on raising PRs, PR titles, and comments to add while raising an issue or getting an issue assigned to oneself.…View Comment
correaa open issue ROCm/rocThrust#475.
How is rocThrust synchronized with the Nvidia version of Thrust?
that is the question….View Comment
artv3 closed issue LLNL/RAJA#1691.
Improve RAJA layout documentation when using unit stride
When using Views and Layouts it is critical for performance to specify unit stride. Per user request, it would help users if we improve documentation to highlight this. One option is to enable unit striding by default. …View Comment
adayton1 closed issue LLNL/RAJA#1744.
MSVC compilation failure w/ 64-bit intrinsics in x86 configurations
After axom updated its raja dependency on Windows from raja@2024.02.0 to raja@2024.07.0, our 32-bit x86 CI started to fail…View Comment
jameshcorbett open issue flux-framework/flux-coral2#230.
Filtering for rabbit jobs
@behlendorf requested some sort of utility that would return a list of jobids which ran on specific rabbits over a given time. …View Comment
cmoussa1 closed issue flux-framework/flux-accounting#503.
flux-accounting guide: missing definition(s) for terms
Was perusing the flux-accounting guide this morning and noticed at least one definition missing for fair share. This guide should be stepped through and updated with any other terms that would benefit from having its definition written at the bottom of the guide….View Comment
jeremylt closed issue Eudicods/outworlds-wastes#107.
Battlefield Support Asset Conversion
Battlefield Support Asset Skills of 6 and 5 equate to ‘Skill 4’ and ‘Skill 3’ for standard values as BSP Assets are always counted as moving at a run. So their skill calculation includes their ‘Always running or jumping’. …View Comment
trws closed issue flux-framework/flux-sched#1140.
build: `make install` doesn't honor the install prefix in all cases
When running on LC, I’m configuring the installation to my home directory, but it’s still trying to do stuff (not sure the word really) to /usr/lib:…View Comment
vsoch closed issue flux-framework/flux-k8s#80.
Segfault in PreFilter
The current CI tests are failing because: …View Comment
wihobbs open issue flux-framework/flux-core#6361.
shell: need plugin to disable ASLR
With ASLR enabled by default on most Linux systems, some users may want a way to disable it for the purposes of debugging parallel programs. See personality(2) …seems like a pretty simple shell option we could add….View Comment
IanLee1521 open issue elastic/integrations#11388.
[Elasticsearch]: Cluster Ingest dashboards don't work
### Integration Name…View Comment
artv3 open issue LLNL/RAJA#1751.
Improve permuted layout construction
Is your feature request related to a problem? Please describe. …View Comment
bwpriest open issue LLNL/saltatlas#81.
metric_hyperplane_partitioner is not compatible with key-value containers
Currently in v0.3-dev, if you pass a key-value store to saltatlas::dhnsw::partition_data(), such as a ygm::container::map, the for_all()s in saltatlas::metric_hyperplane_partitioner barf because they expect a bag of pairs. The offending lines occur here, here, here, and here.…View Comment
benzwick open issue lorifranke/SlicerTMS#17.
Cannot download large files from GitHub
I tried to clone the SlicerTMS repo but I get the following error. It looks like there is a limit on the data that can be downloaded from GitHub. …View Comment
BradWhitlock open issue LLNL/axom#1440.
Array crash using RelWithDebInfo with CUDA.
I think I was able to make a reproducer for an axom::Array crash. Build the following example program on rzansel. It will build in RelWithDebInfo mode and will crash in the axom::Array constructor trying to default-initialize memory with placement new on the host (but the memory should be on the GPU). …View Comment
grondo open issue flux-framework/flux-sched#1308.
fluxion specific `libintern.so` should have package-specific prefix and version suffix
Fluxion currently installs the string interner library as /usr/lib64/libintern.so. This has a larger than zero probability of conflicting with some other package, so as suggested by @jameshcorbett it should probably get a fluxion somewhere in there like libfluxion-intern.so.…View Comment
cyrush open issue Alpine-DAV/ascent#1394.
implicit func box slice yields rounded corners
We should ask VTK-m folks if this is expected: …View Comment
markcmiller86 closed issue visit-dav/visit#18881.
Query on empty plot crashes viewer in 3.3 but not 3.2
### Describe the bug …View Comment
grondo open issue flux-framework/flux-core#6349.
perilog: consider raising an exception when an epilog fails
Problem: The perilog plugin currently does not raise an exception when the epilog fails, as documented in this comment: …View Comment
tgamblin closed issue hpsfoundation/tac#4.
Lifecycle Policy: Template Update for Extra Questions (Core)
Proposal to fine-tune the questions in the new project template: https://github.com/hpsfoundation/tac/blob/main/.github/ISSUE_TEMPLATE/new-project-proposal.md …View Comment
garlick open issue flux-framework/flux-core#6345.
flux overlay/module trace should optionally show payloads
Problem: flux overlay trace and flux module trace would be more useful in some situations if they could show the payloads being sent with each message….View Comment
vsoch open issue flux-framework/flux-operator#235.
External API and client
If we want to present this as a full cluster, it would be interesting to try creating a gRPC external API service (written in Go) that would allow external submission, and depending on the audience we are targeting, does not require users to rewrite their jobs. I need to think more about that last piece….View Comment
jeremylt open issue Eudicods/outworlds-wastes#110.
Scenario Clarification
Assassination should indicate some time limit …View Comment
grondo open issue flux-framework/flux-core#6323.
perilog: rethink sending SIGKILL after SIGTERM when canceling prolog/epilog
When the perilog plugin cancels the prolog/epilog, it first sends SIGTERM followed by a SIGKILL after a configurable number of seconds. For this use case perhaps the SIGKILL is not necessary, since the administrative prolog/epilog should behave when sent SIGTERM, and begin its termination process (which should be given a longer grace period than the default of 5s) …View Comment
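The SIGTERM-then-SIGKILL pattern described in this entry can be sketched generically as follows. This is a minimal illustration on a POSIX system, not the perilog plugin's implementation: the function name `cancel`, the `grace_seconds` parameter, and the sleeping child process are all assumptions.

```python
import signal
import subprocess
import sys


def cancel(proc, grace_seconds=5.0):
    """Send SIGTERM, then SIGKILL only if the process outlives the grace period."""
    proc.terminate()  # delivers SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # delivers SIGKILL on POSIX
        proc.wait()
    return proc.returncode


# A stand-in for a long-running prolog/epilog script (hypothetical).
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
status = cancel(child, grace_seconds=5.0)
print(status)
```

A process that exits promptly on SIGTERM never reaches the SIGKILL branch, so the grace period only matters for tasks that need time to clean up.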
astrophpeter open issue LLNL/popclass#61.
add check that number of samples is > number of parameters to catch shape errors.
Whenever a samples array is passed into a function, check that number of samples > number of parameters. Throw value error if false. Write tests to test this functionality. …View Comment
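A check of this kind might look like the sketch below. It is an assumption-laden illustration, not popclass code: the function name `check_sample_shape` and the (number of samples, number of parameters) layout are hypothetical.

```python
def check_sample_shape(samples):
    """Raise ValueError unless samples has more rows (samples) than columns (parameters).

    Assumes a 2D layout of (n_samples, n_params); accepts anything with a
    .shape attribute or a nested sequence.
    """
    shape = getattr(samples, "shape", None)
    if shape is None:
        if len(samples) == 0:
            raise ValueError("samples is empty")
        shape = (len(samples), len(samples[0]))
    if len(shape) != 2:
        raise ValueError(f"expected a 2D samples array, got shape {tuple(shape)}")
    n_samples, n_params = shape
    if n_samples <= n_params:
        raise ValueError(
            f"expected more samples than parameters, got shape {tuple(shape)}; "
            "the array may be transposed"
        )
    return n_samples, n_params


good = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 samples, 2 parameters
transposed = [[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]]  # 2 rows, 3 columns

print(check_sample_shape(good))
try:
    check_sample_shape(transposed)
except ValueError as err:
    print("rejected:", err)
```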
JustinPrivitera closed issue visit-dav/visit#17922.
bug with material id processing in the blueprint plugin
### Describe the bug …View Comment
IanLee1521 open issue elastic/kibana#194240.
Re-size widget to fit large metric values in ES|QL
Kibana version: …View Comment
jameshcorbett open issue flux-framework/flux-coral2#223.
Add rabbit entries to `flux job info` list of useful keys
flux-coral2 software adds a number of entries to jobs’ KVS (see https://flux-framework.readthedocs.io/en/latest/tutorials/lab/rabbit.html#additional-attributes-of-rabbit-jobs). @behlendorf noted: …View Comment
IanLee1521 open issue elastic/integrations#11218.
[GitLab] Add support for other GitLab logs
Hi there!…View Comment
IanLee1521 open issue LLNL/elastic-stacker#21.
Add support / handling of deleted objects
When objects are deleted in Elastic, we should update the local cached content as well (maybe as an option that can be disabled) to know when things are removed on the server side….View Comment
tpatki open issue flux-framework/flux-multi-cluster-utilities#5.
Test across multiple clusters, document findings
markcmiller86 open issue LLNL/Silo#421.
Silo cannot utilize externally installed HDF5 compression filters
Currently, the only way to compress datasets with Silo is for the data producer to use one of Silo’s pre-installed filters. There is no way for a data producer to use an arbitrary custom compression filter. …View Comment
JustinPrivitera open issue Alpine-DAV/ascent#1386.
RZ mesh support
Need to add RZ mesh support to Ascent. A user reported the following: …View Comment
grondo open issue flux-framework/flux-core#6294.
apparent hello storm after scheduler reloaded
In the logs captured below, the scheduler was apparently loaded at 10:58, then we see the messages: …View Comment
jedbrown open issue RDycore/RDycore#219.
Single-communication RHS evaluation
## Current state …View Comment
grondo open issue flux-framework/flux-sched#1298.
fluxion crash in 10% of user runs on elcap
On elcap a user is running a series of jobs that appear to cause a Fluxion crash roughly 10% of the time. We have one corefile (in my homedir under fluxion-crash). I believe this is still occurring for one or two users, so we’ll want to determine how to get a patch for this specific issue to apply to the current flux-sched RPM asap.…View Comment
cyrush open issue LLNL/conduit#1324.
see if we should update libver logic for hdf5 to support newer versions
https://github.com/LLNL/conduit/blob/fa0f8d3738b0c4cef95a0d05003f3b80aa381f23/src/libs/relay/conduit_relay_io_hdf5.cpp#L3192…View Comment
astrophpeter closed issue LLNL/popclass#57.
make package conda installable
JustinPrivitera open issue visit-dav/visit#19815.
Upgrade ADIOS2 reader to use VTK::IOADIOS2
See #19796 for more context. …View Comment
astrophpeter open issue LLNL/popclass#57.
make package conda installable
samuelkgutierrez closed issue hpc/quo-vadis#274.
test-pthread-split example fails
Currently, the test program test-pthread-split fails:…View Comment
tpatki open issue flux-framework/flux-power-monitor#10.
Add Contributing.md and Code_of_conduct.md
jameshcorbett open issue flux-framework/flux-coral2#210.
Only delete workflows after a configurable timeout
@bdevcich asked, …View Comment
cyrush closed issue visit-dav/visit#19026.
avtGhostNodes Array not being extended properly
When working on #17950 for VisIt 3.4, we discovered that the avtGhostNodes array was not being extended properly: …View Comment
dmed256 closed issue dmed256/aibo#3.
Add toggle-button to packages
## Description …View Comment
astrophpeter closed issue LLNL/popclass#27.
Test pip install and package build on pull request
chu11 closed issue chaos/whatsup#20.
pingd flaky with >100 or so nodes
I’ve been playing around with whatsup-pingd for use with pdsh -v. It mostly operates well as long as there are less than 100 or so nodes in /etc/genders. …View Comment
cyrush open issue LLNL/conduit#1323.
fortran vs mpi logic blt cmake logic
see: …View Comment
cyrush open issue LLNL/blt#697.
fortran vs mpi logic
Use case: …View Comment
amagela closed issue lanl/benchmarks#109.
MiniEM build YAML clarification
The documentation is missing a step w.r.t. copying template.yaml into spack.yaml. This will be added. …View Comment
artv3 closed issue LLNL/raja-suite-tutorial#27.
macOS build issues
``` …View Comment
chu11 open issue flux-framework/flux-core#6251.
content: require backing store for checkpoint put
As noted in #6242, there appears to be an inconsistency in the content module …View Comment
hppritcha open issue open-mpi/prrte#17.
Merge PRRTE MCA namespace into the OMPI one
A follow-on to issue https://github.com/open-mpi/prrte/issues/16. Once PRRTE source code is moved out of 3rd-party to a top level folder, it may be worth seeing what would be involved in flattening the MCA space so that the PRRTE project MCA namespace would be subsumed into the OMPI MCA name space. …View Comment
IanLee1521 open issue elastic/integrations#10929.
GitLab Integration has inconsistent option menus
I was starting to test out the beta GitLab integration (v0.3.0) and noticed that only the “GitLab API logs” and “GitLab Production Logs” expose their “Paths” options in the integration settings normally. …View Comment
cyrush closed issue Alpine-DAV/ascent#1297.
surfaces are rendered in parallel that should not be visible
this image is a render of a field with a camera positioned above a wing, where the surface topology is defined by tris in 3D space. geometric entities on the lower side that shouldn’t be visible are seen in the render – the flap fairings, and the aft part of the engine nacelle. this was done on 28 ranks: …View Comment
agcapps open issue Alpine-DAV/ascent#1375.
If ascent_actions.yaml exists, it overrides what the user passes to Ascent::execute()
`Ascent::execute()` lets the user pass in a Node with actions: …View Comment
aschaf open issue mfem/mfem#4479.
Create `ParMesh` in cascading way
Hi, …View Comment
BradWhitlock closed issue visit-dav/visit#19780.
Blueprint reader not using offsets for unstructured topologies.
I had a mesh where I left some extra unused slots in the elements/connectivity array. I turned some quads into triangles. The elements/sizes were set to the right length and the elements/offsets were valid for seeking to each element. VisIt did not plot this correctly since the Blueprint code was not using elements/offsets to seek through the zones in the mesh (it was using sizes). …View Comment
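The difference this entry describes can be shown with a minimal Python sketch. This is not VisIt's or Conduit's code: the function names and the sample arrays are hypothetical, and only the roles of the connectivity, sizes, and offsets arrays come from the issue text. Walking elements by accumulating sizes assumes they are packed back to back, so it misreads a connectivity array that has unused slots, while seeking with offsets recovers each element.

```python
def elements_by_sizes(connectivity, sizes):
    """Walk elements assuming they are packed back to back (ignores offsets)."""
    elements, start = [], 0
    for size in sizes:
        elements.append(connectivity[start:start + size])
        start += size
    return elements


def elements_by_offsets(connectivity, sizes, offsets):
    """Seek to each element with its offset, then read `size` node ids."""
    return [connectivity[off:off + size] for size, off in zip(sizes, offsets)]


# Hypothetical data: two quads turned into triangles, each leaving one
# unused slot (marked -1) in the connectivity array.
connectivity = [0, 1, 2, -1, 2, 3, 4, -1]
sizes = [3, 3]
offsets = [0, 4]

packed = elements_by_sizes(connectivity, sizes)             # picks up an unused slot
seeked = elements_by_offsets(connectivity, sizes, offsets)  # reads the intended triangles
```

In this sample the size-based walk reads the unused slot as the first node of the second element, which matches the kind of incorrect plot the entry reports.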
garlick open issue worlickwerx/pi-parport#67.
consider the SMT version of the 40 pin GPIO header
Problem: the raspi cooler is almost required for the pi5 which necessitates mounting a HAT+ with 15mm (16mm recommended) clearance per the HAT+ spec. …View Comment
vsoch open issue converged-computing/performance-study#7.
ToDo Before End of Month
- [ ] Debug size 256 cluster on AWS EKS - likely cni issue …View Comment
jameshcorbett open issue flux-framework/flux-coral2#201.
Optionally dropping DW_JOB_ prefix from environment variables
I was talking with @behlendorf and @mcfadden8 and the consensus was that LC users would prefer to have their environment variables be just `NAME` instead of `DW_JOB_{NAME}`. Presumably the same goes for `DW_PERSISTENT_{NAME}`. …View Comment
jameshcorbett open issue NearNodeFlash/NearNodeFlash.github.io#199.
Optionally dropping `DW_JOB_` prefix from environment variables
I was talking with @behlendorf and @mcfadden8 and the consensus was that LC users would prefer to have their environment variables be just `NAME` instead of `DW_JOB_{NAME}`. Presumably the same goes for `DW_PERSISTENT_{NAME}`. Could this be enabled with some additional configuration? …View Comment
cyrush open issue Alpine-DAV/ascent#1368.
legacy vtk extract issues
- check if fields are written as well as the mesh data …View Comment
vsoch open issue flux-framework/Tutorials#44.
Add tutorial survey link to main survey
To make it easy to find / click, we should have the survey linked directly at the end of the tutorial. This is a form that I can add right before we do the survey. It is also in the slides….View Comment
correaa open issue correaa/boost-multi#69.
.stride function (from layout) shouldn't compile
```cpp …View Comment
JustinPrivitera open issue LLNL/conduit#1320.
Make the node schema pointer getter const
https://github.com/LLNL/conduit/blob/fa0f8d3738b0c4cef95a0d05003f3b80aa381f23/src/libs/conduit/conduit_node.hpp#L3777 …View Comment
hppritcha open issue open-mpi/prrte#16.
move PRRTE from 3rd-party/prrte to a top level folder in the Open MPI repo
A follow-on to issue #13 - move PRRTE source code out of 3rd-party/prrte and into a top level prrte folder. This would involve significant changes to the top-level autogen.pl in the OMPI repo. A good way to assess these changes is to examine the autogen.pl when an older branch of OMPI is checked out, for instance v4.1.x. This move would also require changes to Makefiles, and possibly moving the mca folder location in PRRTE to be more like that found in the ompi/ompi, ompi/opal, and ompi/oshmem folders….View Comment
hppritcha open issue open-mpi/ompi#12773.
m4 cleanup on main since we will not be supporting external PRRTE
The m4 code in ompi_setup_prrte.m4 can be considerably simplified for the `main` (and soon to be 6.0.x) branches since we will no longer support an external PRRTE option. …View Comment
eliasboegel closed issue lilyinstarlight/nixos-cosmic#274.
Method to exclude default COSMIC apps
The KDE packaging for NixOS includes a way to exclude Plasma applications from being installed: https://nixos.wiki/wiki/KDE …View Comment
JustinPrivitera closed issue visit-dav/visit#19737.
8/13 test suite failures to review
### Describe the bug …View Comment
vladotomov closed issue mfem/mfem#4406.
Multiscale: An FE domain solve for every integration point
Hello, …View Comment
chu11 closed issue chaos/genders#62.
nodeattr --compress has 1024 character output limit
`HOSTLIST_BUFLEN` in nodeattr.c is used in `_hash_hostrange` such that the output size from `nodeattr -f <file> --compress` can only be <= 1024 characters: …View Comment
agcapps closed issue LLNL/axom#1167.
Memory error in the in/out query
User reports an Umpire error message saying “Cannot find allocator for pointer:
grondo open issue flux-framework/rfc#423.
RFC 27: consider adding flags to alloc request
In flux-framework/flux-core#5739 there is a proposal to add a flag to preemptible/standby jobs. A scheduler can then use this flag to determine if a job may be canceled to make way for a higher priority job. However, currently flags are not shared with the scheduler, which would have to read the `submit` event to get the flags. …View Comment
JustinPrivitera open issue LLNL/conduit#1318.
GLTF export
Add GLTF to relay mesh io…View Comment
jedbrown open issue Tehforsch/diman#78.
maintained version of hdf5
The hdf5 lib here has become unmaintained and doesn’t work with recent 1.14.x releases. We might want to switch to the maintained fork….View Comment
cyrush open issue Alpine-DAV/ascent#1360.
add a reflect filter
analogous to VisIt’s reflect operator, with the same quadrant options, and the options to keep / remove the original. …View Comment
tpatki open issue ovis-hpc/ovis#1431.
Update Variorum sampler to support v0.8
Variorum v0.8 has updated the JSON API for obtaining vendor-neutral power telemetry. …View Comment
tpatki open issue flux-framework/flux-power-mgr#6.
Rename to flux-power-monitor and flux-power-manager
Rename module names to flux-power-monitor and flux-power-manager. …View Comment
artv3 open issue LLNL/raja-suite-tutorial#30.
Example 04
Is missing N in the non-solution. …View Comment
tzanio closed issue mfem/web#271.
Documentation on DG boundary conditions is misleading (for 1D at least)
The documentation on boundary conditions reads, “In the Discontinuous Galerkin (DG) formulation the Natural, Neumann, and Robin can be implemented in precisely the same manner as in the continuous case.” (see https://github.com/mfem/web/blob/12882dfb42bf16fa181a0da7b9c1c92abbb57642/src/fem_bc.md?plain=1#L217) …View Comment
tzanio closed issue mfem/mfem#4398.
Inconsistency Between MFEM + ParaView for Symmetric Tensor Outputs
For outputting a `QuadratureFunction` with vdim = height*(height+1)/2 (where height is the physical dimension of the problem), as would be used in particular for outputting a symmetric tensor, to ParaView using `ParaViewDataCollection`, MFEM orders components from the symmetric matrix $T_{ij}$ as: …View Comment
garlick open issue flux-framework/flux-core#6211.
not ok 17 - attach: reports job shell Killed if job shell is killed
Problem: this test frequently fails when ‘make check’ is run with a high -j value. …View Comment
jameshcorbett open issue flux-framework/flux-coral2#197.
DWS: handle broken kubernetes connection
An error occurred on elcap: …View Comment
cyrush open issue visit-dav/visit#19731.
opacity for wireframe and point rendering modes
### Is your feature request related to a problem? …View Comment
grondo open issue flux-framework/flux-coral2#195.
`dws_environment` shell plugin should cancel eventlog watch
The `dws_environment` job shell plugin watches the job eventlog until the `dws_environment` event is found: …View Comment
vsoch closed issue converged-computing/fluxnetes#4.
Discussion: do a reservation instead
I think instead of `MatchAllocate` we should do `MatchAllocateElseReserve` and then we don’t need to run the job again - the work will be scheduled for a future point. I haven’t ever used this function, but I’m assuming since a single schedule cycle updates these reservations, we likely need to schedule this to run again at a time in the future (the time of the reservation) and run `MatchAllocateElseReserve` again (and keep going until it’s allocated) or use a separate queue. …View Comment
vsoch closed issue converged-computing/container-chonks#1.
Next things for docker container chonker experiments
- [x] look at number of layers over time …View Comment
cyrush closed issue LLNL/conduit#1314.
data type equals does not take into account the stride
```c++ …View Comment
vladotomov closed issue mfem/mfem#4429.
Boundary Dof (id) and physical location
Hello, …View Comment
tpatki open issue LLNL/variorum#576.
Sample at 50ms in the IBM Power9 Energy APIs
Currently, we’re sampling every 250ms – as a result, when we encounter a code that doesn’t do much work and takes less than 250ms, energy deltas get reported as zero. …View Comment
trws open issue flux-framework/flux-core#6184.
t2406-job-exec-cleanup Racy sub-test
Spurious test failure due to job not getting killed, or so it seems. Get a null for kill_timeout. …View Comment
tpatki open issue LLNL/variorum#575.
Support a choice between delta from the first call versus previous call in `variorum_get_energy_json`
Update the API to support nested calls in general, especially in Caliper-like tools. …View Comment
cyrush open issue visit-dav/visit#19720.
allow copying of auto generated expressions defs in the exprs window
### Describe the bug …View Comment
tgamblin open issue bssw-tutorial/bssw-tutorial.github.io#120.
tgamblin for 2024-08-02-atpesc: Update resource links by 2024-07-29
- [x] “Update https://github.com/bssw-tutorial/bssw-tutorial.github.io/tree/main/2024-08-02-atpesc/presentation-resources/spack.md by 2024-07-29”…View Comment
jeremylt closed issue CEED/libCEED#1640.
Bug in diagonal assembly AtPoints with multi active fields
The diagonal assembly at points has a bug. You can run this mpm mixed linear problem in ratel with `-pc_type svd` or using field split but with `-pc_type jacobi` it won’t converge….View Comment
wihobbs open issue flux-framework/flux-core#6165.
Usage of `/var/tmp/username`
Nick Sly gave a presentation to LC this morning about building containers for TOSS 4 and TOSS 5. In this presentation, he mentioned that his development on these containers for Flux was made inconvenient by the fact that Flux requires `/var/tmp/username` to be mounted from the host of the container, which isn’t done by default. …View Comment
wihobbs open issue flux-framework/flux-accounting#472.
`flux view-account --list-all-banks-in-json-please`
Steph was asking (for myLC/Lorenz) if there’s a command in accounting that can be used to list all of the banks (excluding the expired ones) on a host. I assume that if there isn’t, this is probably pretty low-hanging fruit…? …View Comment
artv3 open issue LLNL/RAJA#1711.
hip/cuda_global_thread_{x,y,z} not supported in RAJA::kernel
We get the following error when trying: …View Comment
markcmiller86 closed issue visit-dav/visit#19622.
Add modulus operator to expression language
This is already here as a function. Should we add it as an operator?…View Comment
vsoch open issue riverqueue/river#471.
workers not running
Hi! I have a setup that I’m trying to get working here and I’m trying to do a basic job Insert, and then verify that the worker is running. I’ve tried everything I can think of to do, and although I see that the insert returns a jobid (and reports adding to the queue): …View Comment
vsoch open issue flux-framework/flux-docs#271.
Tutorial container and instruction
We got good feedback today that it would be nice to have our Flux Tutorial more easily available on demand, and I agree. For this, after we finish for RADIUSS I’ll build a final variant of the “spawn” image and add some page here that more officially shows how to run it….View Comment
wihobbs open issue flux-framework/flux-core#6146.
t2406 and t2900 fail in GitLab CI, possible matching problem?
Sort of a head scratcher, for a few days now, t2406 job-exec: kill-timeout > original value has been failing: …View Comment
tpatki open issue flux-framework/PerfFlowAspect#165.
Add Fortran support with LLVM pass.
JustinPrivitera open issue LLNL/conduit#1309.
save mesh needs to put relative path in root file
https://github.com/visit-dav/visit/issues/18790…View Comment
markcmiller86 open issue LLNL/Silo#382.
SIZEOF_OFF64_T checks maybe need to go away
At one time, we had quite a mix of 32 and 64 bit systems. Then, we needed to take care that when Silo was dealing with things like file sizes or file offsets, the data types being used were 64 bit. …View Comment
jameshcorbett open issue flux-framework/flux-sched#1255.
Condensing JGF
JGF is verbose, and Rabbit-y JGF on elcap systems can become very large. We discussed offline several ways to shrink JGF, both while maintaining the same format and compatibility with the standard (which?) and breaking the standard to achieve greater reductions in size. …View Comment
garlick open issue flux-framework/flux-sched#1254.
make qmanager internal info available via flux module stats
Placeholder for @trws - following up on a face to face discussion about enhancing `flux module stats sched-fluxion-qmanager` to give insight into what’s going on, for example to use in tuning queue depth….View Comment
BradWhitlock open issue visit-dav/visit#19691.
Refine MFEM boundary meshes using MultiresControl operator.
I thought I’ve been able to do this before but perhaps it was with Blueprint. The DT team has been trying to visualize boundary meshes for MFEM meshes written via VisItDataCollection and read into VisIt via an mfem_root file. It is possible to plot the boundary mesh but when a MultiresControl operator is applied to it, nothing happens. …View Comment
hppritcha open issue open-mpi/prrte#13.
move prrte out of a submodule and into Open MPI repo
As part of the integration process of this PRRTe fork into Open MPI, we would like to move the prrte source currently in a submodule of the Open MPI source tree out of the submodule and into the Open MPI source proper. …View Comment
jandrej open issue EnzymeAD/Enzyme#2004.
Failure with new c++ sugar
https://fwd.gymni.ch/PE2qjx…View Comment
cyrush closed issue LLNL/conduit#1306.
Compilation failure on Ubuntu 22.04 with OpenMPI on recent commit
The compilation of the development branch seems to fail on Ubuntu 22.04 with OpenMPI after the most recent commits with the following error: …View Comment
hppritcha open issue mpi-forum/mpi-issues#850.
Need to add text in description of MPI_Session_init that it is a local operation
# Problem …View Comment
grondo open issue flux-framework/flux-sched#1251.
housekeeping only drains nodes if systemd unit can be run
The housekeeping service relies on the systemd unit to drain ranks that fail housekeeping. However, if the housekeeping systemd service isn’t configured or fails to start, then the node is not drained. Instead the node is put back into service without housekeeping being run, which could cause any number of failures….View Comment
imguoguo open issue imguoguo/Comments#37.
CamStan open issue LLNL/scr#592.
Create docs for python interface
Anything important to the users can go in https://github.com/LLNL/scr/tree/develop/doc …View Comment
hppritcha open issue open-mpi/prrte#10.
rename all prrte related binaries that will be used by Open MPI
Some of the binaries currently created as part of the PRRTe build will still be needed by Open MPI. However, we don’t want to have names which conflict with an external PRRTe installation. …View Comment
tzanio closed issue mfem/mfem#4344.
HypreBoomerAMG preconditioner memory leaks
Using the HypreBoomerAMG as a preconditioner for the CGSolver causes memory leaks in parallel code (but not in serial code). This can be observed in MFEM’s example codes ex0 and ex0p. This can be significant over a few hundred iterations. Changing the class from HypreBoomerAMG to, for example, a HypreSmoother resolves the issue. …View Comment
alizma closed issue METHODS-Group/DRDMannTurb#97.
Include examples and documentation in the pre-compiled wheels for the package
I think the examples and documentation should be included in the pre-compiled wheels. This would make the package a lot more usable. …View Comment
adrienbernede closed issue LLNL/radiuss-spack-configs#69.
Create specs for poodle
so we can set up GitLab CI for CTS2 and start running CI there….View Comment
markcmiller86 open issue visit-dav/visit#19657.
Do we still need this file
I happened to run across `empty.c` in `src/engine/main` which contains nothing but this comment… …View Comment
cyrush closed issue visit-dav/visit#19656.
bug reading uniform blueprint mesh
### Describe the bug …View Comment
tpatki open issue flux-framework/PerfFlowAspect#162.
Upgrade the LLVM Pass to the New Pass manager (Clang-15+)
Related issue is #139. The `-flegacy-pass-manager` option does not work after `clang-14`, limiting the use of PFA on newer versions of clang. …View Comment
cyrush open issue LLNL/conduit#1295.
use some sort of `self` construct to allow blueprint index to survive renames?
``` …View Comment
artv3 open issue LLNL/RAJA#1691.
Improve RAJA layout documentation when using unit stride
When using Views and Layouts it is critical for performance to specify unit stride. Per user request, it would help users if we improve documentation to highlight this. One option is to enable unit striding by default. …View Comment
CamStan open issue ECP-VeloC/KVTree#73.
KVTree release for SCR v4.0
See SCR v4.0 project: https://github.com/orgs/LLNL/projects/26 …View Comment
vsoch open issue rootless-containers/usernetes#335.
Usernetes with bypass4netns on multi-node
Hi @AkihiroSuda ! :wave: …View Comment
v-dobrev closed issue mfem/mfem#4384.
Matthek triangle method for fast optimization of edges in mechanics and fluid simulation
See matthek triangle method is an extraction of nature forms by trees. …View Comment
vsoch closed issue flux-framework/spack#189.
[package-update] for flux-core: update-package/flux-core-2024-06-05
This is a request to open a pull request for a package update….View Comment
tpatki closed issue flux-framework/PerfFlowAspect#123.
Add object format for trace files (in addition to JSON Array format)
Chrome Tracing Format documentation: https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/preview#heading=h.q8di1j2nawlp …View Comment
artv3 open issue LLNL/raja-suite-tutorial#20.
Provide more description to lesson files
Will help users know what the lesson is ahead of time. …View Comment
adityamnk open issue IBMSpectrumComputing/ray-integration#18.
LSB_AFFINITY_HOSTFILE not set
Hi, I am using LSF on LLNL’s Lassen cluster. …View Comment
jakubcerveny closed issue hbb1/2d-gaussian-splatting#99.
Spurious blobs
I’m testing 2DGS on the ZipNerf dataset (Alameda kitchen shown below), and with default settings I’m getting these blobby surfaces: …View Comment
tpatki open issue LLNL/variorum#565.
Add CPU/Mem energy tracking on IBM print/get_energy_json APIs
cyrush open issue LLNL/conduit#1287.
topology combiner helper
While debugging an issue, I manually built up several independent topologies and at the end of the process I was yearning for a function that combines them into a single mixed-element topology. …View Comment
eliasboegel open issue CEED/libCEED#1621.
More advanced Rust binding examples
Hi, …View Comment
gauthier12 open issue florianfesti/boxes#688.
Unable to run pre-commit on master
Describe the bug …View Comment
acfisher open issue mfem/mfem#4369.
Runtime dispatch HowTo
#3138 is adding runtime dispatch capability and building it in for certain high traffic integrators. It would be great to write up a HowTo article to document how you use this dispatch capability and how to apply it to more integrators….View Comment
artv3 open issue LLNL/RAJA#1681.
Add examples with gpu global indexing for both kernel and launch
In a new example it may be useful to add a global indexing example to point users to. There are some for launch but a focused example may be useful for folks browsing through examples. The thought came from this interaction: https://github.com/LLNL/RAJA/issues/1679 …View Comment
jedbrown open issue HarryvL/fcVM-workbench#1.
Update wiki entry for fvFEM?
It sounds from your forum post like fcVM is in some sense a successor to fcFEM. Should this entry be updated or should there be a new entry? …View Comment
gauthier12 open issue florianfesti/boxes#687.
new category for boardgame
Hi, …View Comment
hppritcha open issue hppritcha/prrte#2.
Remove psched tool from Open MPI PRRTe fork
grondo open issue flux-framework/flux-core#6051.
job-manager possibly sends alloc requests after jobs have been canceled
In flux-framework/flux-sched#1222 @trws observed …View Comment
artv3 open issue LLNL/RAJA#1675.
Reduce redundant code in examples
PR https://github.com/LLNL/RAJA/pull/1665 introduces a new format for RAJA examples. Advantages include reduced code duplication while maintaining the ability to run a variety of backends when available. We can draw inspiration from this example to clean up other examples. …View Comment
tpatki closed issue LLNL/variorum#513.
Add rocm 6.0.2 dependency to docs.
tomstitt open issue LLNL/PyDV#277.
`create_plot` returns the `plt` module (instead of a handle to the plot)
I’m not sure if this is intended but `create_plot` returns `matplotlib.pyplot` (`plt`) instead of a plot-specific reference like `figure` or `axis`. Does it make sense to return those instead? I currently just call `matplotlib.pyplot.gcf()` after calling `create_plot` to interact with the reference, but that is less ergonomic imo. …View Comment
JustinPrivitera open issue visit-dav/visit#19605.
Make CSV an entry in the Export Options
We should add CSV to the list of export options. When you select it, it will use Blueprint to CSV to do what you want. …View Comment
tpatki closed issue LLNL/variorum#553.
Newer MI200+ GPUs report GPU-domain-level (GPU socket-level) power data as opposed to individual devices
As an example from our MI200 node, here’s what `rocm-smi` (v6.0.2 onwards) reports. Note here that GPU devices 1,3,5,7 do not report power values but report a `N/A` instead. This is by design. …View Comment
jeremylt open issue freeCodeCamp/freeCodeCamp#55194.
Profile GitLab Link
### Is your feature request related to a problem? Please describe….View Comment
jameshcorbett open issue flux-framework/flux-coral2#166.
Stop daemon on compute nodes after jobtap prolog completes
There are clientmount daemons running on every compute node to handle the mounting and unmounting of rabbit file systems. The daemons produce noise, and there have been some investigations lately into how to reduce it. In theory the daemons only need to be running when there are file systems to mount or unmount, at the beginning and end of jobs. …View Comment
mplegendre open issue dyninst/dyninst#1763.
Missing entries in DWARF line information lookup
When looking up line information entries by address, SymtabAPI is failing to return some line entries that should exist. …View Comment
chu11 closed issue LLNL/magpie#343.
fix cornercase / functionality / default tests
It has been a long time since I have tested “default”, “cornercase” and “functionality” tests (I normally just test specific hadoop/spark versions). It appears various test errors have crept in over the years as little things in newer versions have changed. …View Comment
cyrush open issue Alpine-DAV/ascent#1317.
replay -- parse actions before reading data
best to know if you have malformed actions before you spend effort doing I/O. …View Comment
v-dobrev open issue xiaoyeli/superlu_dist#166.
Potential issue when using PARMETIS column permutation in v8.2.1
I noticed this during testing of mfem-4.7 with SuperLU_DIST v8.2.1 – one of the mfem examples that uses SuperLU_DIST, `examples/ex11p.cpp`, hangs when run with: …View Comment
corbett5 closed issue ITensor/ITensorMPOConstruction.jl#17.
Follow style guides for casing of function names and keyword arguments
In Julia, it is standard to use snake casing rather than camel casing, see style guides such as https://docs.julialang.org/en/v1/manual/style-guide and https://github.com/JuliaDiff/BlueStyle….View Comment
jeremylt closed issue Eudicods/outworlds-wastes#15.
Official Release
I would like to release a stable version of the rules around the time that the Mercenaries Kickstarter boxes arrive for people. …View Comment
CamStan open issue ECP-VeloC/spath#30.
spath release for SCR v4.0
See SCR v4.0 project: https://github.com/orgs/LLNL/projects/26 …View Comment
CamStan open issue ECP-VeloC/shuffile#33.
Shuffile release for SCR v4.0
See SCR v4.0 project: https://github.com/orgs/LLNL/projects/26 …View Comment
CamStan open issue ECP-VeloC/redset#56.
Redset release for SCR v4.0
See SCR v4.0 project: https://github.com/orgs/LLNL/projects/26 …View Comment
CamStan open issue ECP-VeloC/rankstr#24.
Rankstr release for SCR v4.0
See SCR v4.0 project: https://github.com/orgs/LLNL/projects/26 …View Comment
CamStan open issue ECP-VeloC/er#42.
ER release for SCR v4.0
See SCR v4.0 project: https://github.com/orgs/LLNL/projects/26 …View Comment
CamStan open issue ECP-VeloC/AXL#144.
AXL release for SCR v4.0
See SCR v4.0 project: https://github.com/orgs/LLNL/projects/26 …View Comment
hppritcha closed issue spack/spack#42651.
Installation issue: OpenMPI@5.0.{1,2} incorrect pmix dependency version range
### Steps to reproduce the issue …View Comment
BradWhitlock closed issue LLNL/axom#1337.
Calling host function in host device function.
The following was reported on Teams. A host function is being called from a host_device function. In this case, the std::numeric_limits<>::max() function is being called and it is host-only. Calls like this could affect Axom on CUDA or HIP. …View Comment
jakubcerveny closed issue jakubcerveny/gilbert#11.
C implementation returns 0, 0 as an index for larger dimensions.
I think it is because of this line. …View Comment
grondo open issue flux-framework/flux-sched#1222.
submitting a large number of jobs and canceling them caused scheduler "hang"
On elcap a user submitted a large number of jobs, each requesting a specific node, then thought better of it and canceled them all. For the past few hours the scheduler has not scheduled any jobs and the “alloc pending” count is stuck at many thousands. …View Comment
markcmiller86 open issue visit-dav/visit#19591.
Cannot parse xdmf file
### Describe the bug …View Comment
vsoch closed issue flux-framework/spack#193.
[package-update] for flux-sched: update-package/flux-sched-2024-06-07
This is a request to open a pull request for a package update….View Comment
grondo open issue flux-framework/flux-core#6038.
flux-start silently ignores `--recovery` when `-s, --test-size` is also present
``` …View Comment
artv3 closed issue LLNL/RAJA#1566.
Add expt::KernelNaming forall example
We are missing an example that exercises the kernel naming capability in RAJA::forall. At the moment I can’t seem to compile the example provided in the RAJA docs (https://raja.readthedocs.io/en/develop/sphinx/user_guide/feature/reduction.html#experimental-reduction-interface). …View Comment
garlick open issue flux-framework/flux-sched#1221.
add ability to query current schedule
Problem: as mentioned in #1219, it would be handy to be able to dump a schedule snapshot, perhaps as json. …View Comment
wihobbs open issue LLNL/maestrowf#444.
feature request: "prettier" error message for missing keys or empty YAML file
Hi folx! I was working with some new-to-HPC users in a tutorial this morning, and one of them observed that when a blank YAML or YAML missing values was passed to `maestro run`, you get an entire Python traceback: …View Comment
cyrush open issue Alpine-DAV/ascent#1310.
compile issue when dray stats are off?
https://github.com/Alpine-DAV/ascent/blob/0158038eb468e5a2e470f63ae271b5297d880ce2/src/libs/dray/utils/appstats.cpp#L264…View Comment
v-dobrev open issue mfem/mfem#4335.
Fix the FMS unit test and the FMS example file `data/star-q3.fms`
The FMS unit test fails as of v4.7. We need a tolerance when comparing the node coordinates. …View Comment
grondo open issue flux-framework/flux-core#6026.
`flux overlay status` is slow on large systems
`flux overlay status` is slow on large system instances with a flat TBON. There is potential that it is unnecessarily contacting all rank 0 children serially to produce the default output. …View Comment
garlick open issue flux-framework/flux-pmix#100.
rhel 9 compilation errors
Problem: rpm build of flux-pmix fails in our rhel 9 build farm with …View Comment
jedbrown open issue FossifyOrg/Messages#180.
Group messages have wrong authors
### Checklist…View Comment
jeremylt open issue CEED/libCEED#1595.
NumElem arg for BasisApplyAtPoints
With the GPU strategy I’m using for AtPoints, I need to modify the BasisApplyAtPoints signature to include a number of elements argument…View Comment
artv3 open issue LLNL/raja-suite-tutorial#19.
Advanced tutorial
- [ ] Example of using gpu loop vs direct policies @kab163 …View Comment
cyrush open issue Alpine-DAV/ascent#1306.
add ability to clear expr + query history
markcmiller86 open issue visit-dav/visit#19565.
This is alarming...Engine and mdserver linked with DB plugin libs!
Go to build dir for engine and do a `make clean; make VERBOSE=1 >& junk.out` and then grep `junk.out` for `hdf5`. You will get hits. But, you should NOT get hits for hdf5. hdf5 is used only in a database plugin lib. If I look at a link of the `libengine_ser.dylib`, I get all the items listed below. …View Comment
garlick open issue flux-framework/flux-test-collective#22.
create environment for flux-core t2410-sdexec-memlimit.t to run
Problem: `t2410-sdexec-memlimit.t` is skipped in flux-core’s regular CI. …View Comment
garlick open issue flux-framework/flux-sched#1210.
investigate unscheduled jobs during cancelation leak
As noted in flux-framework/flux-core#5990, some follow up is needed on why jobs weren’t scheduled when the “alloc leak” occurred on el cap….View Comment
vsoch open issue flux-framework/flux-k8s#71.
Post refactor changes needed
- [x] Update fluence to go 1.20 or 1.21: We are going to hit issues using fluence (go 1.19) with other integrations like rainbow (go 1.20) and on our systems (go 1.20), and after #69 should consider updating. …View Comment
milroy closed issue flux-framework/flux-k8s#71.
Post refactor changes needed
- [ ] Update fluence to go 1.20 or 1.21: We are going to hit issues using fluence (go 1.19) with other integrations like rainbow (go 1.20) and on our systems (go 1.20), and after #69 should consider updating. …View Comment
tpatki open issue LLNL/variorum#546.
Add power capping functionality to Sapphire Rapids.
Add the functionality for `variorum_cap_each_socket_power_limit` and `variorum_cap_best_effort_node_power_limit` by adding two new low-level functions to the `06_8F` file for our RIKEN collaborators. …View Comment
artv3 closed issue LLNL/RAJA#1651.
fix SYCL example
When built with SYCL, the `dynamic_mat_transpose.cpp` example code fails at run-time. I haven’t investigated; it could be a memory issue. The example also uses the RAJA method `setQueue` to set a global SYCL resource. This should be removed, and a SYCL resource should be passed to the launch method directly instead. The RAJA `setQueue` method is going away….View Comment
JustinPrivitera closed issue visit-dav/visit#18226.
Upgrade VisIt to the latest and greatest Conduit
Eventually, we must upgrade Conduit to the next version (likely 0.8.5) so we can take advantage of wedge and pyramid support. …View Comment
bhatele closed issue hpcgroup/pipit#102.
Update HPCToolkit reader to access meta.db
Since HPCToolkit version 2022.10.01, we switched to the new database format, which uses meta.db instead of experiment.xml from the old database format. …View Comment
cyrush open issue LLNL/conduit#1283.
error building with numpy 2
Error from conda forge numpy 2 build …View Comment
adayton1 open issue LLNL/axom#1342.
Fast closest_point operator for Quadrilateral and/or Polygon
Does anyone know of a fast algorithm for computing the closest point to a (possibly non-planar and possibly concave) quadrilateral or polygon? I have an application that currently computes the algebraic center of a quadrilateral and then uses that to divide the quadrilateral into 4 triangles. The closest point is then computed for each triangle and we take the one with the minimum squared distance. But this algorithm is expensive (currently accounts for about 15% of the run time)….View Comment
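The approach described in this issue (split the quadrilateral at its algebraic center into 4 triangles, find the closest point on each, keep the one with the minimum squared distance) can be sketched roughly as follows. This is an illustrative sketch only, not Axom's implementation: the function names are hypothetical, the per-triangle routine is the standard Voronoi-region test from the collision-detection literature, and degenerate (zero-area) triangles are not handled.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _axpy(a, u, s, v=(0.0, 0.0, 0.0), t=0.0):
    # Returns a + s*u + t*v, component by component.
    return tuple(a[i] + s * u[i] + t * v[i] for i in range(3))

def closest_point_on_triangle(p, a, b, c):
    """Closest point to p on triangle (a, b, c); hypothetical helper."""
    ab, ac, ap = _sub(b, a), _sub(c, a), _sub(p, a)
    d1, d2 = _dot(ab, ap), _dot(ac, ap)
    if d1 <= 0 and d2 <= 0:
        return a                                  # vertex region a
    bp = _sub(p, b)
    d3, d4 = _dot(ab, bp), _dot(ac, bp)
    if d3 >= 0 and d4 <= d3:
        return b                                  # vertex region b
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        return _axpy(a, ab, d1 / (d1 - d3))       # edge ab
    cp = _sub(p, c)
    d5, d6 = _dot(ab, cp), _dot(ac, cp)
    if d6 >= 0 and d5 <= d6:
        return c                                  # vertex region c
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        return _axpy(a, ac, d2 / (d2 - d6))       # edge ac
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        w = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return _axpy(b, _sub(c, b), w)            # edge bc
    denom = 1.0 / (va + vb + vc)
    return _axpy(a, ab, vb * denom, ac, vc * denom)  # interior

def closest_point_on_quad(p, quad):
    """Split quad (4 vertices, in order) at its center into 4 triangles."""
    center = tuple(sum(v[i] for v in quad) / 4.0 for i in range(3))
    best, best_d2 = None, float("inf")
    for i in range(4):
        q = closest_point_on_triangle(p, quad[i], quad[(i + 1) % 4], center)
        diff = _sub(p, q)
        d2 = _dot(diff, diff)
        if d2 < best_d2:
            best, best_d2 = q, d2
    return best, best_d2
```

Every query runs four full triangle tests, which is consistent with the cost the issue complains about; the sketch is meant to make the baseline concrete, not to answer the request for a faster algorithm.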
vsoch open issue isala404/kube-ebpf#1.
Updated version?
Hi! I’m trying to test this out (just locally with kind to start) and there is an issue with the init-headers: …View Comment
jakubcerveny closed issue jakubcerveny/gilbert#12.
How to modify it to a curve generation algorithm for local encryption
grondo open issue flux-framework/flux-core#5987.
hostlist: perf issue in `hostlist_find_host()` due to `hostname_create()`
@trws noted in some perf benchmarking of Fluxion that hostlist_find_host() and specifically hostname_create() was taking a large percentage of time due mainly to the calloc(3) used in creating the struct hostname used for matching purposes. …View Comment
vsoch open issue converged-computing/rainbow#37.
ARM build needed for arm clusters that deploy rainbow
tpatki closed issue LLNL/variorum#544.
Incorrect GPU utilization parsing in var_monitor
We noticed in the hackathon yesterday that two GPUs are missing in the output CSV due to a minor parsing error. …View Comment
JustinPrivitera open issue LLNL/conduit#1279.
`generate_sides` for matsets
It would be good to have a path for material sets defined on polytopal topologies….View Comment
tzanio open issue mfem/mfem#4296.
Missing documentation
We should probably add the fem/integ directory to the list of Doxygen documentation directories and check if other important directories are missing….View Comment
vsoch open issue converged-computing/rainbow-experiments#3.
Next spack experiments
- [ ] Fix bug that clusters are added to config (with empty token) …View Comment
BradWhitlock open issue LLNL/axom#1337.
Calling host function in host device function.
The following was reported on Teams. A host function is being called from a host_device function. In this case, the std::numeric_limits<>::max() function is being called and it is host-only. Calls like this could affect Axom on CUDA or HIP. …View Comment
markcmiller86 open issue visit-dav/visit#19506.
3.4.1 doesn't work with trackpad on macOS
### Describe the bug …View Comment
cmoussa1 closed issue flux-framework/flux-docs#105.
flux-accounting: add docs for building, installing, and setting up flux-accounting
Once flux-accounting v0.10.0 is released, it would be a good idea for me to add some official documentation on building/installing flux-accounting and setting it up on a machine. This should include instructions on some of the major components of flux-accounting, which include: …View Comment
jameshcorbett open issue flux-framework/flux-coral2#161.
Enable more flexible lustre scheduling
If #157 goes in and changes the layout of the resource graph, it will enable jobspecs that look like this: …View Comment
markcmiller86 open issue HDFGroup/hdf5#4471.
h5ls cores on toss4 system @llnl compiled with IntelOne API
I am attaching a tar file with as much info as I can provide. …View Comment
samuelkgutierrez closed issue hpc/quo-vadis#35.
Cross-Paradigm Scope Acquisition
We probably need a cross-paradigm scope acquisition function. For example, threads acquiring and then operating on a scope created from within an MPI context. In this use case, the idea is that the threads want to operate within the hardware bounds defined by the scope created by the parent MPI process. …View Comment
alizma open issue LLNL/echemfem#14.
Documentation for fluid solvers?
The provided fluid solvers seem somewhat opaque and would benefit from further discussion. What are the permissible ranges of parameters? Maybe some references to how this works would also be helpful. …View Comment
vsoch closed issue converged-computing/converged-computing.github.io#4.
Add contributors badges to README
tgamblin closed issue spack/spack#43267.
post-solve `output.satisfies(input)` check returns false positive when input is (partially) concrete
### Steps to reproduce …View Comment
adayton1 open issue LLNL/axom#1332.
Vector normalization
Axom currently handles normalizing the zero vector (and tiny vectors) by returning {1, 0, 0}. Is this really the behavior we want? …View Comment
hppritcha open issue mpi-forum/mpi-issues#849.
PR 938 needs a changelog entry for MPI 4.2
# Problem …View Comment
chu11 open issue chaos/cerebro#25.
socket buffer size could be increased
On very large clusters we’ve seen listeners lose packets (grep for port 8851, which is 2293 in hex, in /proc/net/udp). …View Comment
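As a rough illustration of the check mentioned in this issue, the sketch below scans text in the /proc/net/udp layout for sockets bound to a given port and sums their drop counters. It assumes the port appears in hexadecimal after the colon in the local_address column and that drops is the last column, as on the Linux kernels I am aware of; the function name and the sample rows are made up for the example.

```python
def udp_drops_for_port(proc_net_udp_text, port):
    """Sum the 'drops' column for sockets bound to `port` (hypothetical helper)."""
    port_hex = format(port, "04X")  # /proc/net/udp prints ports in hex
    total = 0
    for line in proc_net_udp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 13:
            continue  # not a socket row in the assumed layout
        local_port = fields[1].rsplit(":", 1)[1]
        if local_port.upper() == port_hex:
            total += int(fields[-1])  # assumed: drops is the last column
    return total

# Illustrative sample only; real data would come from open("/proc/net/udp").read().
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
  101: 00000000:2293 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 23456 2 0000000000000000 17
  102: 00000000:006F 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 23457 2 0000000000000000 0
"""

drops = udp_drops_for_port(SAMPLE, 8851)
```

A nonzero drop count on the listener's socket would be in line with packets being lost to a receive buffer that is too small, which is the situation the issue title describes.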
tpatki open issue LLNL/variorum#543.
Variorum get_current_version reports old version 0.5.0
Tested on Lassen with commit 50b21f on 5/2. Need to update version. …View Comment
tpatki open issue LLNL/msr-genie#10.
Add link to paper in README
https://dl.acm.org/doi/abs/10.1145/3624062.3624146…View Comment
jameshcorbett open issue flux-framework/flux-sched#1193.
Match policies malfunctioning on rabbit-like graph
Using the match policies low and lownode on this JGF representation of rzvernal: rzvernal_R_norabbit.json …View Comment
jedbrown closed issue rsmpi/rsmpi#182.
Should ready send be unsafe?
Ready send is quite unsafe, as I’ve documented in https://github.com/rsmpi/rsmpi/commit/1e5eeabc7f2b7f55976d3bc8df0867a229d03425 after someone accidentally misused it. @jtronge @hppritcha :wave: Do you feel like there is any way to make a safe ready send (and would that be worth it)? My biggest concern here is that while misuse can often be reported by MPI, it is not even deterministically knowable whether a receive has been posted before a ready send is initiated. I think there are some symbolic execution/formal methods techniques that can be applied to this problem, but I don’t see a way to get that sort of thing into the Rust type system. Also, I think ready-send is quite a niche that we shouldn’t be encouraging users to use. Shall I make the ready send interfaces unsafe for mpi-0.8?…View Comment
BradWhitlock closed issue LLNL/axom#1324.
Find polygon by intersecting tet with a plane
@adayton1 requested a method to find the polygon resulting from the intersection of a tetrahedron with a plane. This would enable ALE3D to use Axom to underpin their 2D ProE shaping. Typically, ALE3D intersects a tet with the plane z = 0, obtaining a 2D polygon (dropping the z coordinate), then computes the volume of intersection of that polygon with nearby quadrilaterals….View Comment
chu11 open issue chaos/powerman#184.
redfishpower: add workaround for libcurl DNS race
A DNS race in libcurl exists in versions before 7.66, it can lead to redfishpower returning “network error” for power operations (“ Timeout was reached” after PR #183). It appears to exist in RHEL libcurl-7.61.1-33.el8_9.5.x86_64. …View Comment
imguoguo open issue riscv-software-src/opensbi#353.
System freezes when setting smp value to 54 or higher with qemu-system-riscv64
Description: …View Comment
jeremylt open issue freeCodeCamp/news#877.
Menu Button on News Platform
### Describe the Issue…View Comment
wihobbs open issue flux-framework/flux-test-collective#21.
add multiple brokers per node
@grondo caught an issue in intel-MVAPICH on corona mentioned in flux-core#5912 that shows up when running multiple brokers per node. That’s a configuration we should be testing….View Comment
adayton1 closed issue LLNL/axom#1307.
Add test cases for finding the closest point on a triangle
My application is getting fairly different answers in some cases when using the closest_point query. I noticed there are no test cases for the triangle case. Having a few test cases would definitely increase my confidence that closest_point is actually doing the right thing. …View Comment
IanLee1521 open issue elastic/support-diagnostics#690.
Add support for sanitizing kibana and logstash bundles
Right now, the sanitization script / readme states: …View Comment
cmoussa1 open issue flux-framework/flux-accounting#445.
docs: move flux-accounting guide to this repo
The flux-accounting guide in flux-docs should be moved to this project instead of living in flux-docs….View Comment
cyrush open issue Alpine-DAV/ascent#1278.
allow reuse of past actions
if user knows actions won’t change, provide a path to avoid checking for ascent_actions.{yaml,json}, and use last set of actions….View Comment
chu11 open issue flux-framework/flux-core#5906.
cron: race may cause cron jobs to never run again
While playing around with flux cron for #5905 noticed a corner case (or possibly 2). …View Comment
grondo open issue flux-framework/flux-core#5904.
flux-perilog-run exits silently with failure when one or more ranks are not online
While debugging flux-framework/flux-sched#1182 it took much longer than necessary to determine what was going on because the prolog was failing silently. …View Comment
IanLee1521 open issue elastic/integrations#9648.
New OpenShift Integration?
Hi there – …View Comment
aperloff open issue LLNL/SSAPy#1.
SSAPy v1.0.0 not properly closing the lunar harmonics file?
Describe the bug …View Comment
cyrush open issue visit-dav/visit#19456.
color tables missing between 3.3.3 and 3.4.1
### Describe the bug …View Comment
adrienbernede open issue LLNL/radiuss-spack-configs#100.
Clean up old compilers.
cyrush open issue visit-dav/visit-website#19.
direct pushes are allowed, but you get yelled at
when pushing you will see: …View Comment
eleon open issue hpc/quo-vadis#105.
Test double split operations
Create a program in tests to test the following scenarios. Let’s say we have a dual socket node with 2 GPUs per socket. We also have four MPI tasks. …View Comment
jameshcorbett open issue flux-framework/flux-coral2#145.
dws-jobtap: handle dws service crashes
cjekel open issue huggingface/transformers#30056.
SDPA gives nans/infs during sampling on ROCM w/ float16
### System Info…View Comment
IanLee1521 open issue herrBez/elastic-blog-openshift-logging#5.
Meta: Upstream an integration?
@herrBez - this might or might not be a question for you, but what do you think about the idea of turning all of these snippets into a dedicated OpenShift agent integration? That way we could just add all of these pieces in a package instead of piecemeal. …View Comment
jamiebramwell open issue LLNL/serac#1099.
Add SLEPc for eigenvalue solvers
This should be fairly straightforward now that we build with PETSc. https://slepc.upv.es/…View Comment
adayton1 open issue LLNL/axom#1305.
Requesting closest_point operator for Segment
This is a request for a closest_point operator for finding the closest point on a Segment. …View Comment
cyrush open issue visit-dav/visit#19435.
visit_utils.engine.open(method=`slurm`) --> use SLURM_NPROCS
### Describe the bug …View Comment
markcmiller86 open issue LLNL/Silo#372.
toc object returned from python interface not iterable or introspectable
In the python interface, DBGetToc() returns a toc object, but that object is useless except to print. A program needs to be able to iterate over it, introspect it, etc….View Comment
cyrush open issue Alpine-DAV/ascent#1266.
check that visit session converter works with python3
check current status of visit session python and add a unit test for the converter to make sure it stays healthy….View Comment
jameshcorbett open issue flux-framework/flux-coral2#139.
Draining compute nodes that lose connection with their rabbit
Sometimes compute nodes can lose their connection with their rabbits. When this happens they should be drained. Although compute-only jobs could still run on those nodes, Brian Behlendorf and @garlick seemed to agree in separate offline discussions that draining the nodes will get admins to investigate; if the nodes were not drained, they might not notice….View Comment
jameshcorbett open issue NearNodeFlash/NearNodeFlash.github.io#146.
Testing / kind environment: marking `Storage` compute nodes as offline
For testing purposes it would be nice if there was a mechanism to mark compute nodes as “Offline” in the Storage resource so that I could check that Flux handles the update correctly. …View Comment
JustinPrivitera closed issue LLNL/conduit#1265.
generate_sides w/ fields output needs to be robust to no fields input
This will fail if the source mesh lacks fields: …View Comment
adayton1 open issue LLNL/RAJA#1618.
Support assignment operator for layouts (and views)
RAJA/util/TypedViewBase.hpp:481:23: …View Comment
vsoch open issue converged-computing/ensemble-operator#6.
Design Thinking: submitting jobs
I’m looking at a traditional “submit this job” minicluster, e.g., here is the current lammps: …View Comment
chu11 closed issue chu11/freeipmi-mirror#69.
don't use MAXHOSTNAMELEN for hostname buffer size
see https://savannah.gnu.org/bugs/index.php?64824…View Comment
uvilla open issue pecos/tps#261.
Assign GPU to MPI rank
The current way that in TPS we assign a GPU to each MPI rank is (see here ): …View Comment
jandrej open issue EnzymeAD/Enzyme#1802.
Injected headers for c++ break tooling
The injected headers in https://github.com/EnzymeAD/Enzyme/blob/main/enzyme/Enzyme/Clang/include_utils.td break tooling like clangd for autocompletion. …View Comment
jeremylt open issue CEED/libCEED#1532.
Drop duplicate headers
Loading source to a string for JiT currently chases all local includes. It would help for debugging if we make sure we only include each file once….View Comment
garlick open issue chaos/powerman#165.
a powerman.dev send string with stray printf tokens can cause a segfault
From powerman.dev(5): …View Comment
alizma closed issue METHODS-Group/DRDMannTurb#44.
Test Suite
We should include tests for basic functionality for …View Comment
tpatki open issue LLNL/variorum#532.
Add support for print and JSON APIs for GPU energy values
vsoch open issue go-hep/hep#1010.
distributed use cases?
Hi! I’m looking for a go framework to run experiments on an HPC cluster, and I’m wondering if any of your simulations are amenable to that? Basically I can have a set of connected nodes and (if the library supports something like distributed) I’m hoping to use them. Thanks!…View Comment
vsoch closed issue flux-framework/flux-operator#211.
Refactor v0.2.0: Read if you want to use the Flux Operator!
We have a WIP pull request that will make several improvements to the Flux Operator, and if you are using for the first time we recommend you use this version. The reason we aren’t merging is to await a paper (that details notes from the previous version). …View Comment
jedbrown open issue rust-lang/mdBook#2337.
Codegen/profile options in configuration
### Problem…View Comment
jandrej closed issue mfem/mfem#4186.
QuadratureSpace and FiniteElementSpace discrepancy for GetVDim
QuadratureSpace doesn’t track the vdim whereas FiniteElementSpace does. …View Comment
cyrush open issue Alpine-DAV/ascent#1257.
issue building vtk-m with gcc (8.3.1) + cuda (10 or 11) on LLNL lassen
Compiler info: …View Comment
agcapps closed issue LLNL/axom#1210.
Read bbox subset of mesh
Users of a code have been spotted reading in huge ProE mesh files with abandon, then complaining when they run out of memory. (“All die. O the embarrassment.”) My first answer is “well, don’t do that.” …View Comment
cyrush open issue visit-dav/visit#19385.
timeslider output format bug
### Describe the bug …View Comment
markcmiller86 closed issue markcmiller86/hello-numerical-world#6.
Critical Documentation
The update_solution_* functions should have doc-strings in the code so that people can modify and test them. Currently there is minimal documentation. …View Comment
gardner48 closed issue LLNL/sundials#211.
[RFC] Request for comment on a proposal to require a (almost) C99 compliant compiler to build SUNDIALS
The SUNDIALS team is requesting comments in regard to the following proposal. Specifically, we are interested in comments about negative impacts the proposal may have on our users (concrete or otherwise). …View Comment
adrienbernede open issue LLNL/radiuss-shared-ci#35.
Re-running "release allocation" job should be fail-proof.
We don
cyrush open issue visit-dav/visit#19368.
detect ssh key accept prompts when logging into a new system
### Describe the bug …View Comment
alecbcs closed issue spack/spack-gantry#15.
Tooling/config updates
- update pyproject.yaml …View Comment
tpatki open issue LLNL/variorum#522.
Rename powmon to var-monitor
Considering the added functionality in powmon, renaming it to var-mon or var-monitor may be a good idea….View Comment
jameshcorbett open issue flux-framework/flux-core#5771.
Transient `no job-ingest.submit service` error on elcap
Yesterday afternoon between 17:55 and 17:56 Brian Behlendorf submitted a bunch of jobs on elcap and they all failed with flux-run: ERROR: No service matching job-ingest.submit is registered. …View Comment
balos1 closed issue LLNL/sundials#249.
Values may overflow in SUNHashMap on Windows
See warnings in https://github.com/LLNL/sundials/actions/runs/3861321674/jobs/6582228224. …View Comment
gardner48 closed issue LLNL/sundials#409.
Type conversion (maybe corrupting data)?
Hi …View Comment
cmoussa1 open issue flux-framework/flux-accounting#426.
plugin: move accounting-specific helper functions to `accounting.cpp`
The priority plugin has defined a number of helper functions that are specific to flux-accounting, and thus would make more sense if they were instead defined in accounting.cpp. After a perusal through the plugin code, the functions that make the most sense to move out would be check_queue_factor () and check_map_for_dne_only (). …View Comment
correaa open issue openbabel/openbabel#2675.
Make OpenBabel read pymatgen format
Is your feature request related to a problem? Please describe. …View Comment
vsoch open issue converged-computing/jsongraph-go#7.
serialization to json needs to remove element structure
vsoch open issue c0mm4nd/go-hwloc#1.
topology can't be allocated in Go (Is this being maintained)?
I just tried adding to my library, and: …View Comment
vsoch open issue flux-framework/fluxion-go#7.
Conversion of JGF v2 to JGF v1
Fluxion currently requires JGF version 1. The version 2 is a better design that allows for indexing the nodes based on name (as opposed to a list) and there are subtle differences in the design. I would prefer to continue developing tools (e.g., compspec-go) anticipating v2, so I think an OK strategy for the time being will be to allow fluence to accept a v2 and convert to v1 to give to flux. I’ll be able to implement this shortly when I’m ready (I’m currently adding the export of the cluster graph from compspec-go, which I’m going to use version 2 for). This will mean that: …View Comment
cmoussa1 open issue flux-framework/flux-accounting#424.
`job.state.priority`: use new external function for association lookup, general function improvement
Continuing the various improvements surrounding the priority plugin code, the job.state.priority callback could stand to be improved by making use of the new external function(s) defined in accounting.cpp, as well as a few general comment and function description improvements….View Comment
jedbrown open issue CGNS/CGNS#754.
Format specifier macros for cgsize_t
HDF5 has the following definitions, which allow reading and formatting of type hsize_t. …View Comment
cyrush open issue visit-dav/visit#19343.
3.4.1 release build tracker
Tracking Issue for all 3.4.1 release builds done by LLNL team. …View Comment
vladotomov closed issue mfem/mfem#4144.
How to generate the right-hand side using an analytical solution?
Hi
tpatki open issue flux-framework/flux-sched#1146.
Performance Data Collection for Fluxion
Use either Caliper or PerfFlowAspect to annotate the key Fluxion functions to gather scheduling overhead data. …View Comment
tgamblin closed issue archspec/archspec#113.
Imprecise information returned from archspec.cpu.host() on Windows
On windows, archspec could return more significant info. It seems to provide default information that isn’t detailed for the current system. My understanding is that there should be similar outputs for windows and WSL2 since they are on the same physical machine. …View Comment
cyrush open issue LLNL/uberenv#129.
python3.12 dropping distutils
distutils is dropped in python 3.12 …View Comment
cyrush open issue visit-dav/visit#19328.
can't build vtk-9 with qt6 on toss4
### Describe the bug …View Comment
tgamblin closed issue spack/spack#40408.
All submodules unexpectedly fetched
#27293 introduces a method of a package defining specific submodules to update. If the package requests no submodules be updated by sending back an empty list, all submodules are updated….View Comment
samuelkgutierrez open issue hpc/quo-vadis#78.
Exchange Versions During Handshake
grondo open issue flux-framework/flux-core#5746.
job shell blocks at exit in degraded job
Problem: A Flux job can survive loss of a node or job shell if the shell rank does not intersect with the critical-ranks set. However, when that job eventually exits “normally” the remaining shells hang at exit, presumably waiting for the lost shell or shells to enter the exit barrier. …View Comment
tpatki open issue flux-framework/PerfFlowAspect#132.
Need to add CMake setup files
We need the equivalent of SetupJansson.cmake in this repository. …View Comment
IanLee1521 open issue LLNL/toss-stig#46.
TOSS-04-020070 - MFA options
Make explicit to document with ISSO if using something else….View Comment
jamiebramwell open issue LLNL/serac#1080.
Investigate why DSuperLU can't be used when systems are resized
I had some issues with SuperLU in the tests where if the matrix was resized, then SuperLU wouldn’t work. This is an issue in SolidMechanics::warmStartDisplacement() where only the displacements are updated without the LM block. I only tested with strumpack and SuperLU, but if MUMPS/Pardiso/etc. are added to serac in the future, this macro may need to be updated… …View Comment
JustinPrivitera closed issue LLNL/conduit#1243.
Refactor Generate Sides code to take advantage of `data_accessor` set
Now that #1143 is complete, we can finish cleaning up generate_sides templating. See #1138 for more info and discussion….View Comment
JustinPrivitera open issue visit-dav/visit#19313.
The tests are broken: `No module named 'visit_testing'`
``` …View Comment
jakubcerveny closed issue jakubcerveny/gilbert#7.
Implement `d2xy`, `xy2d`, `d2xyz`, `xyz2d`
It would be nice to have standalone conversion functions that convert from spatial dimensions to position along the curve (xy2d, xyz2d) and vice versa (d2xy, d2xyz). …View Comment
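To make the requested interface concrete, here is a minimal sketch of the 2D pair of conversions for the classic Hilbert curve on an n-by-n grid where n is a power of two. This is the well-known textbook bit-manipulation algorithm, not the generalized (arbitrary rectangle) curve that the gilbert repository implements, and the function names are placeholders chosen for this example.

```python
def _rot(n, x, y, rx, ry):
    # Rotate/flip a quadrant so the sub-curve has the right orientation.
    if ry == 0:
        if rx == 1:
            x = n - 1 - x
            y = n - 1 - y
        x, y = y, x
    return x, y

def hilbert_xy2d(n, x, y):
    """Map cell (x, y) to its index along the curve; n must be a power of 2."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rot(n, x, y, rx, ry)
        s //= 2
    return d

def hilbert_d2xy(n, d):
    """Map index d along the curve back to cell (x, y); n must be a power of 2."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rot(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y
```

The two functions are inverses of each other, and consecutive indices map to grid cells that share an edge. A generalized version would presumably need to handle non-square, non-power-of-two extents, which this sketch does not attempt.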
correaa open issue QMCPACK/qmcpack#4921.
some deterministic tests fail
Describe the bug …View Comment
correaa open issue eliaskosunen/scnlib#101.
Library not working in Godbolt Compiler explorer
I am not sure if you have added the library to Godbolt, but in case you did, it is not working. …View Comment
cyrush closed issue LLNL/conduit#1238.
Use O2M index for matset converters
https://github.com/LLNL/conduit/blob/develop/src/libs/blueprint/conduit_blueprint_o2mrelation_index.hpp…View Comment