Tuesday, January 15, 2019

PrintSafepointStatistics is missing in JDK 12

A quick note that the Java flag -XX:+PrintSafepointStatistics has been obsoleted in OpenJDK 12. Using it will cause the VM to exit, printing:
Unrecognized vm option 'PrintSafepointStatistics'
The flag has been replaced by extending "Unified Logging" to handle safepoint statistics as well:
-Xlog:safepoint+stats=debug
The flag was removed as part of the bug fix:
  • JDK-8198720 "Obsolete PrintSafepointStatistics, PrintSafepointStatisticsTimeout and PrintSafepointStatisticsCount options"
This allows the normal "UL" controls to send logging to a file, integrate a number of different log types, etc., but it does change the previous output format. The functionality of the other two flags appears to be gone; see the bug fix for details.
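For launcher scripts that have to work across JDK versions, the switch can be sketched roughly as below. This is only an illustration: the class and method names (SafepointStatsFlags, statsOption, command) and the MyApp main class are made up for this example, and the cut-over at version 12 is taken from the note above. The ":file=" suffix is the standard Unified Logging way to name an output file.

```java
import java.util.ArrayList;
import java.util.List;

public class SafepointStatsFlags {
    /** Picks the safepoint-statistics option for a JDK feature version (hypothetical helper). */
    static String statsOption(int jdkFeatureVersion) {
        // Assumption from the post: the -XX flag is gone in 12 and later.
        return jdkFeatureVersion >= 12
                ? "-Xlog:safepoint+stats=debug"
                : "-XX:+PrintSafepointStatistics";
    }

    /** Builds a java command line; logFile is only honored for the Unified Logging form. */
    static List<String> command(int jdkFeatureVersion, String logFile, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        String option = statsOption(jdkFeatureVersion);
        if (jdkFeatureVersion >= 12 && logFile != null) {
            option += ":file=" + logFile; // UL output selector
        }
        cmd.add(option);
        cmd.add(mainClass);
        return cmd;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", command(11, null, "MyApp")));
        System.out.println(String.join(" ", command(12, "safepoint.log", "MyApp")));
    }
}
```

The resulting list could be handed to ProcessBuilder, or the option string simply pasted into a shell script.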

Some background on safepoints:
This is not a definitive list of resources, but what I found quickly online.

Wednesday, November 28, 2018

Cheat Sheet for cpuinfo features on AArch64 (Arm64)

I couldn't find this posted anywhere, so here it is (also posted to WikiChip).

ARMv8 has many versions (ARMv8.1, etc), which define mandatory and optional features. The Linux kernel exposes the presence of some of these features via hwcaps. These values are displayed in /proc/cpuinfo.

So if you "grep Features /proc/cpuinfo", you may get a result like:

Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm

This table shows the feature name, what version introduced the feature, and a short description.

Name      Versions                Feature Supported
fp        ARMv8.0                 Single-precision and double-precision floating point.
asimd     ARMv8.0                 Advanced SIMD.
evtstrm   N/A                     Generic timer is configured to generate "events" at a frequency of about 100 kHz.
aes       ARMv8.0                 AES instructions (AESE, etc)
pmull     ARMv8.0                 Polynomial Multiply Long instructions (PMULL/PMULL2)
sha1      ARMv8.0                 SHA-1 instructions (SHA1C, etc)
sha2      ARMv8.0                 SHA-2 instructions (SHA256H, etc)
crc32     [ARMv8.0], ARMv8.1 ...  CRC32/CRC32C instructions
atomics   ARMv8.1 ...             Large System Extensions (LSE) - (CAS/SWP/LD[op])
fphp      ARMv8.2-FP16            Half-precision floating point.
cpuid     N/A                     Some CPU ID registers readable at user-level.
asimdrdm  ARMv8.1                 Rounding Double Multiply Accumulate/Subtract (SQRDMLAH/SQRDMLSH)
jscvt     ARMv8.3                 JavaScript-style double->int convert (FJCVTZS)
lrcpc     ARMv8.3                 Weaker release consistency (LDAPR, etc)
dcpop     ARMv8.2                 Data cache clean to Point of Persistence (DC CVAP)
sha3      ARMv8.2-SHA             SHA-3 instructions (EOR3, RAX1, XAR, BCAX)
sm3       ARMv8.2-SM              SM3 instructions
sm4       ARMv8.2-SM              SM4 instructions
asimddp   ARMv8.2-DotProd         SIMD Dot Product
sha512    ARMv8.2-SHA             SHA512 instructions
sve       ARMv8.2-SVE             Scalable Vector Extension (SVE)
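
If you want to check for one of these features from Java rather than by eye, a minimal sketch might look like the following. The class and method names (CpuFeatures, parseFeatures) are invented for this example, and it only parses a single "Features" line passed in as a string; it uses the sample line shown earlier instead of reading /proc/cpuinfo itself.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class CpuFeatures {
    /** Splits one "Features : ..." line into its feature names (hypothetical helper). */
    static Set<String> parseFeatures(String line) {
        int colon = line.indexOf(':');
        // Ignore anything that is not a "Features" line.
        if (colon < 0 || !line.substring(0, colon).trim().equals("Features")) {
            return new LinkedHashSet<>();
        }
        String list = line.substring(colon + 1).trim();
        if (list.isEmpty()) {
            return new LinkedHashSet<>();
        }
        return new LinkedHashSet<>(Arrays.asList(list.split("\\s+")));
    }

    public static void main(String[] args) {
        String sample = "Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm";
        Set<String> features = parseFeatures(sample);
        System.out.println("LSE atomics: " + features.contains("atomics"));
        System.out.println("SVE: " + features.contains("sve"));
    }
}
```

On a real system the same method could be applied to each line of /proc/cpuinfo, for example via java.nio.file.Files.readAllLines.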

This should be complete up until ARMv8.4 features (ran out of time).

See kernel docs and the ARMv8-A Reference Manual for (way) more details...

Thursday, April 12, 2018

Where to get JDK 10 for ARM?


Oracle does not provide JDK 10 binaries for AArch64, so where can you get one?

OpenJDK on AArch64 is actively supported by ARM, Linaro, Red Hat, Cavium and Bellsoft, Qualcomm, and others, but the process of producing publicly available binaries is a bit more patchwork than I'd like. Here are sources that have, or will have, AArch64 builds of OpenJDK:

Works on Arm has a good database of software that runs on AArch64, including Java:
  • https://www.worksonarm.com/explore/openjdk/
If you have additional information please comment here or Tweet me @drwhite. I'll update this post.

See "JDK 10 - What's new for AArch64?" for more info.

Updates:
  • 7/25/2018: Updated AdoptOpenJDK links for new JDK 10 release builds.
  • 7/25/2018: Updated Docker links to specify arm64v8.

JDK 10 - What's new for AArch64?

Oracle has announced the release of Java SE 10, and listed the major new features including:
  1. (JEP 286) Local-Variable Type Inference: Enhances the Java Language to extend type inference to declarations of local variables with initializers. It introduces var to Java, something that is common in other languages.
  2. (JEP 304) Garbage Collector Interface: Improves the source code isolation of different garbage collectors by introducing a clean garbage collector (GC) interface.
  3. (JEP 307) Parallel Full GC for G1: Improves G1 worst-case latencies by making the full GC parallel.
  4. (JEP 310) Application Class-Data Sharing: To improve startup and footprint, this JEP extends the existing Class-Data Sharing ("CDS") feature to allow application classes to be placed in the shared archive.
  5. (JEP 312) Thread-Local Handshakes: Introduce a way to execute a callback on threads without performing a global VM safepoint. Makes it both possible and cheap to stop individual threads and not just all threads or none.
  6. (JEP 317) Experimental Java-Based JIT Compiler: Enables the Java-based JIT compiler, Graal, to be used as an experimental JIT compiler on the Linux/x64 platform.
So that's fine for those fringe x86 developers, but what about you, the mainstream Java developer running on AArch64? All of the features listed above apply to AArch64, except for JEP 317 Experimental Java-Based JIT Compiler (this is in progress).
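
As a quick illustration of the first item (JEP 286), here is a small sketch of var in use. The class and method names (VarDemo, groupByParity) are made up for this example; it needs JDK 10 or later and behaves the same on any platform.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class VarDemo {
    /** Groups values under "even" / "odd" keys, using var for the local declarations. */
    static Map<String, List<Integer>> groupByParity(int[] values) {
        var groups = new HashMap<String, List<Integer>>(); // inferred as HashMap<String, List<Integer>>
        for (var v : values) {                              // inferred as int
            var key = (v % 2 == 0) ? "even" : "odd";        // inferred as String
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(v);
        }
        return groups;
    }

    public static void main(String[] args) {
        var result = groupByParity(new int[] {1, 2, 3, 4});
        System.out.println(result.get("even"));
        System.out.println(result.get("odd"));
    }
}
```

Note that var only works for local variables with initializers; the method's return type and parameters still have to be spelled out.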

Other AArch64 Improvements and Bug Fixes in JDK 10

The AArch64 Porting Community, including Linaro, Red Hat, Cavium (with Bellsoft), and others, has added a number of improvements and bug fixes for AArch64.

Improvements:
  •     JDK-8186915 AARCH64: Intrinsify squareToLen and mulAdd
  •     JDK-8184943 AARCH64: Intrinsify hasNegatives
  •     JDK-8190336 Make sure that AppCDS works on aarch64 platform
  •     JDK-8158361 AArch64: Address calculation missed optimizations
  •     JDK-8169697 aarch64: vectorized MLA instruction not generated for some test cases
  •     JDK-8163011 AArch64: NMT detail stack trace cleanup
  •     JDK-8189745 AARCH64: Use CRC32C intrinsic code in interpreter and C1
  •     JDK-8189596 AArch64: implementation for Thread-local handshakes
  •     JDK-8189439 Parameters type profiling is not performed from aarch64 interpreter
  •     JDK-8188221 Return type profiling is not performed from aarch64 interpreter
  •     JDK-8189176 AARCH64: Improve _updateBytesCRC32 intrinsic
  •     JDK-8189177 AARCH64: Improve _updateBytesCRC32C intrinsic
  •     JDK-8185786 AArch64: Disable some address reshapings
  •     JDK-8179444 AArch64: Put zero_words on a diet
  •     JDK-8178968 AArch64: Remove non-standard code cache size
  •     JDK-8184049 AArch64: Matching rule for ubfiz
  •     JDK-8184964 AArch64: Incorrect match rule for negL_reg
  •     JDK-8182161 aarch64: combine andr+cbnz into tbnz when possible
  •     JDK-8183547 AArch64: Better instruction sequence for stack bangs
  •     JDK-8183533 AArch64: redundant registers saving in arraycopy stubs
  •     JDK-8182583 AArch64: FMA Vectorization on aarch64
  •     JDK-8179701 AArch64: Reinstate FP as an allocatable register
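
Intrinsics like these are transparent to application code: you call the ordinary JDK API and the JIT substitutes the optimized version where it can. As a rough sketch, code along the following lines would presumably be the kind that benefits from the CRC32/CRC32C changes above (that mapping is my assumption, not something this code can verify). The class name CrcDemo is invented for the example; java.util.zip.CRC32C requires JDK 9 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;

public class CrcDemo {
    /** CRC-32 of a byte array via the standard java.util.zip API. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** CRC-32C (Castagnoli) of a byte array via the standard java.util.zip API. */
    static long crc32c(byte[] data) {
        CRC32C crc = new CRC32C();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println("CRC32:  " + Long.toHexString(crc32(data)));
        System.out.println("CRC32C: " + Long.toHexString(crc32c(data)));
    }
}
```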

Bugs:
  •     JDK-8175367 Wrong assert for UseCompressedOops in aarch64 Copy::conjoint_oops_atomic implementation
  •     JDK-8191955 AArch64: incorrect prefetch distance causes an internal error
  •     JDK-8191129 AARCH64: Invalid value passed to critical JNI function
  •     JDK-8191769 AARCH64: Fix hint instructions encoding
  •     JDK-8186325 AArch64: jtreg test hotspot/test/gc/g1/TestJNIWeakG1/TestJNIWeakG1.java SEGV
  •     JDK-8186438 'configure' fails to find installed libfreetype on Ubuntu AArch64
  •     JDK-8184900 AArch64: Fix overflow in immediate cmp instruction
  •     JDK-8182581 aarch64: fix for crash caused by earlyret of compiled method
  •     JDK-8187022 AArch64: UBFX instructions have wrong format string
  •     JDK-8179933 AArch64: Incorrect match rule for immL_255
  •     JDK-8198950 AArch64: org.openjdk.jcstress.tests.varhandles.DekkerTest fails

Lots of good stuff for 64-bit ARM!

This list ignores fixes for broken builds and "backports" from JDK 9 -> JDK 10. The categories are my own quick opinion of improvement vs bug. See my "AArch64 fixes in JDK10" filter for raw data.

Coming Attractions for AArch64 in JDK 11:

Here are a few things that are already in JDK 11:
  • JDK-8196064 AArch64: Merging ld/st into ldp/stp in macro-assembler 
  • JDK-8193260 AArch64: JVMCI: Implement trampoline calls
  • JDK-8196590 Enable docker container related tests for linux AARCH64
  • JDK-8190428 Minimal Dynamic Constant support for AArch64
  • JDK-8198293 AARCH64 - Add CPU detection code for Cavium Thunder X2
  • JDK-8187472 AARCH64: array_equals intrinsic doesn't use prefetch for large arrays. (Also adds SIMD impl)
In addition, JEP 315 Improve Aarch64 Intrinsics is in the works.

You may want to see "Where to get JDK 10 for ARM?"

Disclaimer - even checked-in fixes may be removed before a release, so these are not guaranteed to be in JDK 11.

Tuesday, March 20, 2018

ThunderXStation

Wednesday, February 28, 2018

About Me

I'm Derek White. I work for Marvell / Cavium in Massachusetts.
The views expressed on this blog are my own and do not necessarily reflect the views of Marvell, Inc.

I am the JVM team lead at Marvell, working to make Java and the JVM work really well on the AARCH64 architecture (64-bit ARMv8) in general, and on Marvell/Cavium hardware like the ThunderX2 in particular.

Previously I worked for Oracle on HotSpot (in the GC and Embedded JVM groups), for Oracle Labs on an IoT project based on hardware running Java, and for Sun Labs working on the Squawk JVM - a virtual machine for Java written almost entirely in Java, running on the Sun SPOT (as well as a Cortex-M3).

Other things I've worked on in the past include:
  • Java simulation and performance analysis for Niagara processors (the short answer is "NOPs are bad").
  • Garbage collection and JVM performance issues at Sun Labs (the "Exact VM").
  • A JVM for an unnamed 64-bit OS at Novell.
  • The Dylan programming language and development environment at Apple.
  • The Object Pascal compiler at Apple.
This is my first blog post here. In the future look for posts on Java VM design, porting, performance analysis and tuning, debugging, and obscure cultural references.
Now I wonder where Ruth is?....