Updated: April 21, 2019

Erich Focht

With VEOS 2.0.1 published mid January 2019 we finally get glibc support for VE programs. The glibc based environment should bring some benefits, it is more compatible to Linux and has better and more configurable memory management. We wanted to have glibc right from the start but it turned out that the code was so full of GNU-isms, that ncc could simply not compile it (and it is in good company with other great compilers). A purely scalar gcc port does the job now.

musl-libc based setups will be supported only until March 2019. You will need to recompile your codes after switching to the glibc environment.

VEOS 2.0.1 is only supporting glibc on Centos/RHEL 7.5 installations. Older versions (7.3/7.4) will get glibc support in VEOS 2.0.2 which will be released in February. So if you want to switch now to glibc, you must update to Centos/RHEL 7.5!

The VEOS yum repositories announced in this post have been updated to include the VEOS 2.0.1 packages. You can now easily update from there. This post will repeat some of the steps in that post, for the sake of completeness.

PREPARATION

Make sure you have the public GPG key for VE software imported:

rpm --import https://sx-aurora.com/repos/vesw.public.key
rpm --import https://sx-aurora.com/repos/RPM-GPG-KEY-TSUBASA-soft

Replace your yum repository configurations for VEOS (mostly local repository possibly called /etc/yum.repos.d/TSUBASA-local.repo) by following two repository configuration files:

/etc/yum.repos.d/veos-common.repo:

[veos-common]
name=Aurora VEOS Common
baseurl=https://sx-aurora.com/repos/veos/common
gpgcheck=1
enabled=1

/etc/yum.repos.d/veos-rhel7.repo:

[veos-rhel7]
name=Aurora VEOS RHEL Specific
baseurl=https://sx-aurora.com/repos/veos/RHEL7.5
gpgcheck=1
enabled=1

In the later repo config comment/uncomment the baseurl that corresponds to your variant of RHEL7 or CentOS7.

You should have access to a repository that contains the updated versions of the proprietary packages like SDK, compilers, MPI, NQSV, etc that are adapted to glibc-ve. This can be a local repository, on the same machine, or remote. Please ask your support contact at NEC in case you are not sure or did not receive the updated SDK, yet. The following steps imply that you have a yum repository configuration that points to an updated SDK packages repository.

UPDATING

The steps described below are following the Installation Manual.

Take the VE cards into maintenance mode

/opt/nec/ve/bin/vecmd state set off
/opt/nec/ve/bin/vecmd state set mnt

Stop VEOS, Monitoring and MMM

systemctl stop vemmd
/opt/nec/ve/veos/sbin/terminate-all-veos
systemctl stop mmm
rmmod ve_drv
rmmod vp

Update VEOS

If you are using Infiniband on your system for MPI involving Vector Engines, you must have installed the Mellanox OFED package as well as ve_peermem. You will need to deinstall both before doing the update:

yum remove ve_peermem
/usr/sbin/ofed_uninstall.sh

Now proceed to update VEOS (and possibly CentOS/RHEL 7.5):

yum update

yum group update veos-apprun veos-appdev --disableexcludes=all

NOTE (April, 2019): The original set of instructions recommended to install

# make sure glibc-ve is installed
yum install glibc-ve glibc-ve-devel

This should not be needed any more.

If a user already installed packages using yum install, then packages cannot be updated using yum group update command. Users will face this issue, when they follow the official installation guide. To fix it, do:

yum group mark convert veos-apprun veos-appdev mmm

Install the new, glibc based compilers:

yum remove nec-nc++-musl-inst-1.6.0 nec-nfort-musl-inst-1.6.0
yum install nec-nc++-2.0.8 nec-nfort-2.0.8 nec-nc++-inst-2.0.8 nec-nfort-inst-2.0.8

You also might want to install the newer MPI for glibc-ve. At the time of the release of VEOS 2.0.1 the package names contain the version number 2-0-0:

yum install nec-mpi-devel-2-0-0 nec-mpi-libs-2-0-0 nec-mpi-utils-2-0-0

On some systems we had to make sure some RPMs are installed (dependencies are maybe not complete):

yum install libgcc-ve-static libsysve-devel veos-devel

Reboot the system and wait for the nodes to come online.

Install MOFED, if needed

If you were using Infiniband before and had to deinstall Mellanox OFED, you must install now the appropriate version. From http://www.mellanox.com/page/products_dyn?product_family=26 download “Mellanox OFED 4.3-3.0.2.1”, the tarball, not the ISO.

Untar the tarball, enter the directory and type:

./mlnxofedinstall --add-kernel-support --kmp

Install the ve_peermem package that manages memory mappings between IB cards and VEs.

yum install ve_peermem

Update VMC Firmware

If the vmcfw package was updated it is recommended to update the VMC firmware of the VE card. The VE’s FW version can be checked with the command:

/opt/nec/ve/bin/vecmd info | grep VMCFW

If the version differs from the version of the vmcfw RPM, the VE’s FW should be updated while the cards are in maintenance mode and the VEOS related services are still stopped. The procedure is simple, but make sure the updating command is issued from within the directory containing the firmware:

cd /etc/opt/nec/ve/mmm/vmc

/opt/nec/ve/bin/vecmd state set off
/opt/nec/ve/bin/vecmd state set mnt
systemctl stop ve-os-state-monitor@*.service ve-os-launcher@*.service ve-ived

/opt/nec/ve/bin/vecmd fwup vmc aurora_MK10.bin

Reboot the VH after a FW upgrade.

Bring VE cards back online

If Infiniband for SX-Aurora TSUBASA is installed on the system:

systemctl start vemmd

Then load the VE driver kernel module and start MMM:

modprobe ve_drv
systemctl start mmm

Or reboot the system.


Wikipedia