Open Source Heartbeat

hppritcha closed issue open-mpi/ompi#12080.

MPI 4.1 - need to implement MPI_GET_HW_RESOURCE_INFO

See https://github.com/mpi-forum/mpi-issues/issues/154 …

tzanio commented on issue mfem/mfem#5358.

This PR is now under review (see the table in the PR description). To help with the review process, please do not force push to the branch….

JustinPrivitera commented on issue llnl/conduit#1620.

Closing in lieu of #1627 and #1625…

artv3 commented on issue llnl/RAJAPerf#693.

Nice @michaelmckinsey1 ! What would be neat is to add a batching parameter and allow for different batch sizes, maybe 4 is good for GPU X or 7 is good for GPU Y type of thing …

artv3 commented on issue llnl/RAJA#2009.

@llnl/raja-core , building on the comments and suggestions, the latest push explores the following function to be used in RAJA::launch : …

garlick commented on issue flux-framework/flux-core#7681.

Updates: …

grondo commented on issue flux-framework/flux-core#7674.

I’ve set MWP here….

markcmiller86 commented on issue visit-dav/visit#20971.

@cyrush if you have a min, pls take a look. Artifact handling follows newest guidance from MS and other changes may reduce (I am still evaluating LFS budget usage reports to be sure) LFS BW usage….

balos1 commented on issue llnl/sundials#961.

@BarrySmith this should fix the issue you were having in #935 …

cyrush commented on issue llnl/conduit#1625.

The simplified templates look good – Note: think we want to keep dispatch with the functor as a just in case. …

cmoussa1 commented on issue flux-framework/flux-accounting#883.

Thanks a bunch for the feedback here @jameshcorbett! Setting MWP here…

cyrush commented on issue Alpine-DAV/ascent#1735.

To test, we can use: …

JustinPrivitera commented on issue visit-dav/visit#19544.

reopening so that we see this since it was marked un-reviewed….

imguoguo closed issue sipeed/picoclaw#3070.

[Security] OneBot inbound media URL handling allows host-side arbitrary fetch in PicoClaw

### Advisory Details…

camierjs commented on issue mfem/mfem#5230.

Some failures are hit on GPU with ./pgpu_unit_tests [ParticleSet] triggered by the CHECK(code_N_count == 0); of tests/unit/fem/test_particleset.cpp with expansion: 1 == 0….

trws commented on issue flux-framework/flux-sched#1509.

Yeah, that hardcoded list of subsystems should never have existed. I’ll add it to the list of things to fix for subsystems….

wihobbs commented on issue flux-framework/flux-sched#1502.

> so I guess the docs are not being built by default? …

milroy commented on issue flux-framework/flux-sched#1505.

> In the resource graphs used to repro the issue (e.g. hetchy_jgf_formatted.json) there are many (32 I think?) ssd vertices per chassis, so even if the Fluxion can’t match multiple slots on a single SSD vertex, the job should still match. Am I missing something? …

garlick commented on issue flux-framework/flux-sched#1502.

Nice! Content-wise this seems like a good start. …

wihobbs closed issue flux-framework/flux-docs#130.

reorganize top-level site

We have some good documentation on the docs site but the ad-hoc growth of the documentation has resulted in lack of coherence and flow of the overall site. Also given that Flux is starting to be installed as a system RM at this point, we may get many more users that are most interested in “how do I use a cluster running Flux” vs “how do I build this new thing from source and poke at it” …

chu11 commented on issue flux-framework/flux-core#7668.

haven’t looked at the code yet, but it looks like in this proposal we would be changing the database schema. If we go forward with this general approach, would we like to consider other schema changes while we are at it? I’m mostly thinking of:…

garlick commented on issue chaos/diod#171.

Np! Thanks for the link!…

sam-maloney commented on issue Vanille-N/meander.typ#6.

> if you call par explicitly then the new paragraph splitting logic should be capable of correctly indenting it….

adrienbernede commented on issue llnl/CARE#378.

@lc-hubcast run pipeline…

grondo commented on issue flux-framework/rfc#523.

As I commented in flux-framework/flux-core#7615, this won’t work because it would break backwards compatibility: the scheduler and resource modules of subinstances fetch the R key from the parent. This change would break launching a previous version of Flux under the new version….

tpatki commented on issue flux-framework/flux-multi-cluster-utilities#8.

Closing as the other PRs supersede this one….

jameshcorbett commented on issue flux-framework/flux-multi-cluster-utilities#24.

Hmmm, mergify config not yet complete. Will just merge manually while I figure it out….

sam-maloney commented on issue flux-framework/flux-core#7677.

At a basic housekeeping level, eventually the commit messages will have to follow the formatting requirements: https://flux-framework.readthedocs.io/projects/flux-rfc/en/latest/spec_1.html#patch-requirements…

emily-howell commented on issue Alpine-DAV/ascent#1494.

@jfavre I have made a PR #1736 that addresses a bug I found relating to this ticket when trigger actions are being passed via ascent_actions.yaml. I wanted to confirm if this is the same bug you encountered here. If not, could you provide more information for me to be able to replicate this bug….

sam-maloney commented on issue spack/spack-packages#5177.

Realized I was missing a colon in the when clause of flux_sched/package.py at L102, because I had copied the previous line which was also missing it :laughing: and there were 2 lines with when="@0.19.0" so I changed the second one to when="@0.20.0" as that seems to make sense with the release dates and the commit when it was added….

adayton1 commented on issue llnl/axom#1885.

Is there a way to iterate over the objects in the variant array without first converting to a vector?…

sam-maloney commented on issue flux-framework/flux-sched#1504.

> Works for me, may need to check the spack package for consistency after this. …

chu11 closed issue chu11/freeipmi-mirror#81.

groff warnings in manpages

Hi, QA checks on Debian spotted groff warnings in manpages, I did a fast patch in Debian I used to fix them, if you want, look at it:…

jedbrown commented on issue rsmpi/rsmpi#224.

Thanks, can you add the # Standard section(s) like all the other public functions? …

mplegendre closed issue llnl/Spindle#189.

Careless mistake in commpath PR

A reminder that I need to be compiling with -Wall -Wextra -Werror….

tzanio commented on issue mfem/mfem#5352.

This PR is now under review (see the table in the PR description). To help with the review process, please do not force push to the branch….

jameshcorbett commented on issue flux-framework/flux-sched#1490.

> @jameshcorbett can comment more specifically, but my understanding is that the flux-coral2 module modifies the jobspec after the feasibility check is already done, so the job gets accepted then the jobspec is modified to contain the rabbit request. If you try that same command with --dry-run you should get the original jobspec….

trws commented on issue flux-framework/flux-sched#1490.

@jameshcorbett can comment more specifically, but my understanding is that the flux-coral2 module modifies the jobspec after the feasibility check is already done, so the job gets accepted then the jobspec is modified to contain the rabbit request. If you try that same command with --dry-run you should get the original jobspec….

grondo commented on issue flux-framework/flux-core#7660.

benzwick closed issue A2B-Technology-Corporation/Talk2View-Writer#33.

macOS: make install-oxt fails

JustinPrivitera closed issue visit-dav/visit#4976.

Revolve operator crashes with curvilinear meshes

### Describe the bug …

markcmiller86 commented on issue visit-dav/visit#21000.

@brugger1 is this ok now?…

hppritcha commented on issue spack/spack-packages#5169.

i thought these issues were addressed via https://github.com/spack/spack-packages/pull/5148 ….

v-dobrev opened issue mfem/mfem#5363.

[CUDA] nvcc / cicc memory exhaustion (OOM) compiling element assembly kernels with -arch=sm_120 (CUDA 12.8)

# cicc memory ballooning / OOM compiling element-assembly kernels for sm_120 (Blackwell) with CUDA 12.8…

adrienbernede closed issue llnl/radiuss-spack-configs#175.

[spack-packages-tracker:shared-ci] Update spack-packages reference to 599874d0ccdf

Automated weekly check detected relevant changes in spack/spack-packages for shared-ci….

JustinPrivitera opened issue llnl/conduit#1614.

Conduit Device Support Ongoing Development

## High-Level Goals…

markcmiller86 commented on issue llnl/H5Z-ZFP#168.

> From my reading it looks like the test is just trying to corrupt a bit of the raw data towards the start of the dataset’s data (now the first chunk in this case)? If so, this looks fine to me. …

grondo commented on issue flux-framework/rfc#521.

Sure. I had considered doing that but figured someone looking at a job eventlog might naturally think to look at the “Job States and Events” RFC. I didn’t see anywhere mentioned that this was the job manager RFC. Easy enough to add a new RFC….

garlick commented on issue flux-framework/rfc#521.

I’m surprised we haven’t done this already! …

milroy commented on issue flux-framework/flux-sched#1503.

> In v0.51.0 the jobspec matches with firstnodex….

imguoguo closed issue sipeed/picoclaw#3080.

[Security] PicoClaw launcher `allowed_cidrs` can be bypassed through same-host loopback proxying during first-run setup

picoclaw: PicoClaw launcher `allowed_cidrs` can be bypassed through same-host loopback proxying during first-run setup…

cjvogl commented on issue mfem/mfem#4905.

Closing this PR in lieu of #5362 that will include all the reconstruction work here alongside the SAMRAI-MFEM coupling work….

gardner48 commented on issue llnl/sundials#957.

When using an implicit method with the default nonlinear solver (Newton’s method) you need to create and attach a linear solver before evolving the problem in time. The lines starting here in the ark_analytic.c example show how to do this for a dense linear solver (and matrix). There should be a runtime error in this case, but it seems there is a missing check….

jeremylt commented on issue jeremylt/personal_website#3.

I need to read the Leiden Declaration and see if it applies here…

wihobbs commented on issue flux-framework/flux-sched#1493.

I want to look into that failure, so removing MWP for now….

sam-maloney commented on issue flux-framework/flux-sched#1501.

> If we require >= flux-core v0.78.0, then we could drop the old rc1 style scripts in etc/. In fact, there could be an implicit dependency on modprobe for the sched-fluxion-feasibility module already….

garlick commented on issue flux-framework/flux-core#7467.

It might actually be better to just add a FLUX_MSGGFLAG_COMPOUND flag to indicate that the payload is compound (steal FLUX_MSGFLAG_USER1), and rather than requiring JSONL, use size prefixed segments. Then a message could contain multiple raw payloads, which could potentially avoid base64 encoding in KVS commit messages, and allow batching of content load/store payloads….

cbritopacheco commented on issue cbritopacheco/rodin#294.

@copilot resolve the merge conflicts in this pull request…

jeremylt commented on issue CEED/libCEED#1982.

hmm, this should be compared to the operator application kernel to make sure we are not breaking HipStar support …

hppritcha commented on issue open-mpi/ompi#13956.

Yes I do agree….

balos1 opened issue llnl/sundials#955.

[BUG] arkLsMassSolve ignores some return values from user callbacks.

One of the folks working on the deal.II SUNDIALS interfaces, @vovannikov, has spent what I think must have been a quality afternoon tracing why one of our tests fails. The details are in https://github.com/dealii/dealii/pull/19838 but come down to code that I (mis-)wrote: If a user callback fails in user-provided linear solvers, I either returned +1 or -1, depending on whether it’s a recoverable or non-recoverable failure (and zero if everything is fine). I think that’s what the documentation of SUNDIALS at some point said I should do, but it’s possible that that was about the nonlinear solver callbacks in KINSOL and not about the SUNDIALS linear solvers. Regardless, for the linear solvers, I need to return SUNLS_PSOLVE_FAIL_REC or SUNLS_PSOLVE_FAIL_UNREC, with values +805 and -808, respectively….

samuelkgutierrez commented on issue hpc/quo-vadis#467.

From today’s meeting with Guillaume….

cmoussa1 commented on issue flux-framework/flux-accounting#872.

Thanks @jameshcorbett! These were all great suggestions. I’ve gone ahead and force-pushed up some changes to this PR based on your feedback. I will set MWP here shortly…

hppritcha commented on issue eschnett/mpif#1.

I will set up a poll and invite some folks so we can set up a good time for a call….

alecbcs commented on issue spack/spack-packages#5104.

@spackbot run pipeline…

hppritcha commented on issue open-mpi/ompi-www#563.

@edgargabriel oops forgot to add this file…

alecbcs commented on issue llnl/hubcast-test#6.

@lc-hubcast help…

artv3 commented on issue llnl/camp#205.

Used internally: https://github.com/llnl/camp/blob/56b194f3ea3d0e9d6e2f236de39a4886063e4968/include/camp/resource/hip.hpp#L187C5-L187C37 …

adayton1 commented on issue llnl/Umpire#1072.

> Update: Hubcast doesn’t understand jobs that are allowed to fail. We can either fix the job failure or try to fix within Hubcast somehow.. stay tuned …

artv3 commented on issue llnl/RAJA#2010.

Closing PR until next release…

becker33 commented on issue spack/spack#52492.

I think we should actually hold off on merging these until we’re done with major backports to the 1.2 release branch. Even though we’ve cut the branch already, I don’t want to make backports any harder than they need to be between now and the release, and this one is more minor but this project as a whole touches a ton of files….

hppritcha commented on issue open-mpi/ompi-scripts#95.

never mind this still needs some work….

mplegendre closed issue llnl/Spindle#181.

Spurious testing false positives on randomly-generated names

https://github.com/llnl/Spindle/blob/cbf1d782bf07a1998904466b102a48e7dbbf4bf6/src/logging/spindle_logd.cc#L265…

jeremylt commented on issue libxsmm/libxsmm#855.

Understood …

wihobbs commented on issue flux-framework/flux-core#6952.

I think @grondo and I have accomplished everything scoped here. Closing

tzanio commented on issue mfem/mfem#5019.

This PR is now under review (see the table in the PR description). To help with the review process, please do not force push to the branch….

vsoch commented on issue kubeflow/trainer#3179.

@roberthluo the MPI test uses a wantinfo object to do comparison. The current flux tests do not. Here is the initial discussion:…

markcmiller86 opened issue visit-dav/visit#20987.

node_coords/X and node_coords/Y wrong for parallel decomposed Pixie

### Describe the bug…

v-dobrev commented on issue mfem/mfem#5346.

Hi @nmnobre, @najlkin, @tzanio, @helloworld922, …

grondo opened issue flux-framework/flux-pam#25.

`user@UID.service` fails to start with hidepid=2 and cgroupv2

On systems where /proc is mounted with hidepid=2 and systemd.unified_cgroup_hierarchy=1 is set, user@UID.service fails to start for users who are not members of the group whitelisted for /proc visibility (on TOSS, iseepids)….

garlick commented on issue flux-framework/flux-core#7659.

Oh, maybe I was looking at the same failure b/c mergify just merged this! NM….

jeremylt commented on issue CEED/libCEED#1980.

Ah the correct fix is probably to update the libCEED hash PETSc is using …

vsoch commented on issue urlstechie/urlchecker-python#95.

Spam…

alecbcs commented on issue spack/spack-packages#5083.

Cool! Let’s just make sure we document that variable and alternatively script modifications for the release notes….

becker33 commented on issue spack/spack#52492.

@haampie yes that was my plan…

tgamblin commented on issue spack/spack#52397.

Agree! We are working on revamping the site – see what you think of https://id-preview--a538ff6f-a974-428e-b9b8-9b6a19dda20c.lovable.app (note that this is valid for only 7 days) which is a preview of how we think the site will eventually work….

jandrej commented on issue mfem/mfem#5338.

I wrapped up the changes in #5232 and waiting for reviews. I’ll keep this open until the other PR is merged….

jameshcorbett commented on issue flux-framework/flux-sched#1490.

Anecdotally, the error seems to occur when a user requests more nodes than there are rabbits in the system. The --coral2-chassis flag may help reduce the error rate by constraining node placements to fewer chassis. The error seems to have occurred around the time we switched clusters to using the firstnodex match policy….

grondo opened issue flux-framework/flux-core#7660.

`flux job attach` gets `ENOSYS` from `.proctable` 10 minutes into job

A user reported a mysterious failure of STAT on their job 10 minutes into its run: `flux job attach` failed with `Function not implemented` while the job was apparently running....

chu11 commented on issue flux-framework/flux-core#7656.

> This is why I’d suggested a few “safe” tests get ci=asan markers or something like ci=system. …

hppritcha commented on issue open-mpi/ompi-www#559.

@edgargabriel ping…

hppritcha commented on issue open-mpi/ompi#13598.

i’d say this is worth a try. anyway it looks like coverity has gone on an extended vacation. maybe codeql could provide some of the capabilities that we used to get with coverity?…

cjvogl commented on issue mfem/mfem#4244.

> Regarding testing, it is easy in the way the bug appears at many places

balos1 closed issue llnl/sundials#937.

XXXGetRootInfo in sundials4py not setting rootsfound values

Dear developers,…

adrienbernede commented on issue llnl/radiuss-spack-configs#176.

@kab163 @adayton1 Thank you for that. Replacing was the right move….

garlick commented on issue flux-framework/flux-sched#1497.

Just pushed the change to select the reader using hints from R.scheduling.writer. It looks like only rv1exec (no scheduling key) actually works with partial cancel presently - fine for now since that’s what the job manager will generate….

jameshcorbett commented on issue flux-framework/flux-multi-cluster-utilities#22.

Another example would be a timeout exception. If the job times out, we definitely shouldn’t retry. Otherwise we wouldn’t be respecting the time limit restrictions placed on the job….

cmoussa1 commented on issue flux-framework/flux-accounting#878.

I’ve rebased this to catch up after #880 - I’ll go ahead and set MWP here. Thanks again for reviewing this @jameshcorbett! …

cyrush commented on issue visit-dav/visit#20972.

@biagas thanks for confirming!…

cjvogl closed issue pnnl/SPAECIES#51.

Potential bug in gradient computation for "gamma" rain fall speeds

I believe there is a bug in the computation of the rain number fall speed derivatives using the “non-numerical” approach. Leveraging that…