Updated: April 21, 2019
Erich Focht
With VEOS 2.0.1, published in mid-January 2019, we finally get glibc support for VE programs. The glibc-based environment should bring some benefits: it is more compatible with Linux and has better, more configurable memory management. We wanted to have glibc right from the start, but it turned out that the code was so full of GNU-isms that ncc simply could not compile it (and it is in good company with other great compilers). A purely scalar gcc port does the job now.
musl-libc based setups will be supported only until March 2019. You will need to recompile your codes after switching to the glibc environment.
VEOS 2.0.1 supports glibc only on CentOS/RHEL 7.5 installations. Older versions (7.3/7.4) will get glibc support in VEOS 2.0.2, which will be released in February. So if you want to switch to glibc now, you must update to CentOS/RHEL 7.5!
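Before switching, it can help to confirm which release the host is running. The following is a minimal sketch, assuming the usual format of /etc/redhat-release on CentOS/RHEL 7; the function names release_version and is_glibc_ready are made up for this example and are not part of VEOS.

```shell
#!/bin/sh
# Hypothetical helpers, not part of VEOS.

# Print "major.minor" from a release string such as the contents of
# /etc/redhat-release, e.g. "CentOS Linux release 7.5.1804 (Core)".
release_version() {
    printf '%s\n' "$1" |
        sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed only if the release string names version 7.5, the only
# release for which VEOS 2.0.1 provides glibc support.
is_glibc_ready() {
    [ "$(release_version "$1")" = "7.5" ]
}
```

It could be used like this: is_glibc_ready "$(cat /etc/redhat-release)" && echo "OK to switch to glibc".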
The VEOS yum repositories announced in an earlier post have been updated to include the VEOS 2.0.1 packages. You can now easily update from there. This post repeats some of the steps from that post for the sake of completeness.
PREPARATION
Make sure you have the public GPG key for VE software imported:
rpm --import https://sx-aurora.com/repos/vesw.public.key
rpm --import https://sx-aurora.com/repos/RPM-GPG-KEY-TSUBASA-soft
Replace your yum repository configurations for VEOS (usually a local repository, possibly called /etc/yum.repos.d/TSUBASA-local.repo) with the following two repository configuration files:
/etc/yum.repos.d/veos-common.repo:
[veos-common]
name=Aurora VEOS Common
baseurl=https://sx-aurora.com/repos/veos/common
gpgcheck=1
enabled=1
/etc/yum.repos.d/veos-rhel7.repo:
[veos-rhel7]
name=Aurora VEOS RHEL Specific
baseurl=https://sx-aurora.com/repos/veos/RHEL7.5
gpgcheck=1
enabled=1
In the latter repo config, comment/uncomment the baseurl that corresponds to your variant of RHEL7 or CentOS7.
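If you prefer to script this step, the two files can be generated with here-documents. This is only a sketch: the function name write_veos_repos and its optional target-directory argument are assumptions made for illustration, and the contents are exactly the two configurations shown above (with the RHEL7.5 baseurl).

```shell
#!/bin/sh
# Hypothetical helper: write the two VEOS repo files into a directory.
# The directory defaults to /etc/yum.repos.d; pass another one to try
# the function out without touching the system configuration.
write_veos_repos() {
    dir="${1:-/etc/yum.repos.d}"

    cat > "$dir/veos-common.repo" <<'EOF'
[veos-common]
name=Aurora VEOS Common
baseurl=https://sx-aurora.com/repos/veos/common
gpgcheck=1
enabled=1
EOF

    cat > "$dir/veos-rhel7.repo" <<'EOF'
[veos-rhel7]
name=Aurora VEOS RHEL Specific
baseurl=https://sx-aurora.com/repos/veos/RHEL7.5
gpgcheck=1
enabled=1
EOF
}
```

Run as root, write_veos_repos without arguments writes into /etc/yum.repos.d; you would then still remove or disable the old local repository file by hand.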
You should have access to a repository that contains the updated versions of the proprietary packages (SDK, compilers, MPI, NQSV, etc.) that are adapted to glibc-ve. This can be a local repository on the same machine or a remote one. Please ask your support contact at NEC if you are not sure or have not yet received the updated SDK. The following steps assume that you have a yum repository configuration that points to an updated SDK packages repository.
UPDATING
The steps described below follow the Installation Manual.
Take the VE cards into maintenance mode
/opt/nec/ve/bin/vecmd state set off
/opt/nec/ve/bin/vecmd state set mnt
Stop VEOS, Monitoring and MMM
systemctl stop vemmd
/opt/nec/ve/veos/sbin/terminate-all-veos
systemctl stop mmm
rmmod ve_drv
rmmod vp
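The maintenance-mode and shutdown commands above can be collected in one function so that they always run in the same order. The sketch below is an assumption about how one might script it, not an official tool: run, stop_ve_stack and the DRY_RUN variable are hypothetical names. With DRY_RUN=1 the commands are only printed, which lets you review the sequence first.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1,
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Take the VE cards into maintenance mode, then stop VEOS, monitoring
# and MMM and unload the kernel modules, in the order given in the text.
# The && chain stops at the first command that fails.
stop_ve_stack() {
    run /opt/nec/ve/bin/vecmd state set off &&
    run /opt/nec/ve/bin/vecmd state set mnt &&
    run systemctl stop vemmd &&
    run /opt/nec/ve/veos/sbin/terminate-all-veos &&
    run systemctl stop mmm &&
    run rmmod ve_drv &&
    run rmmod vp
}
```

For example, DRY_RUN=1 stop_ve_stack prints the seven commands without running them. Stopping at the first failure is a design choice of this sketch; if one of the services does not exist on your system, adapt the list accordingly.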
Update VEOS
If you are using Infiniband on your system for MPI involving Vector Engines, you must have installed the Mellanox OFED package as well as ve_peermem. You will need to uninstall both before doing the update:
yum remove ve_peermem
/usr/sbin/ofed_uninstall.sh
Now proceed to update VEOS (and possibly CentOS/RHEL 7.5):
yum update
yum group update veos-apprun veos-appdev --disableexcludes=all
NOTE (April 2019): The original set of instructions recommended installing glibc-ve explicitly:
# make sure glibc-ve is installed
yum install glibc-ve glibc-ve-devel
This should not be needed any more.
If packages were already installed with yum install, they cannot be updated with the yum group update command. Users run into this issue when they follow the official installation guide. To fix it, do:
yum group mark convert veos-apprun veos-appdev mmm
Install the new, glibc based compilers:
yum remove nec-nc++-musl-inst-1.6.0 nec-nfort-musl-inst-1.6.0
yum install nec-nc++-2.0.8 nec-nfort-2.0.8 nec-nc++-inst-2.0.8 nec-nfort-inst-2.0.8
You also might want to install the newer MPI for glibc-ve. At the time of the release of VEOS 2.0.1 the package names contain the version number 2-0-0:
yum install nec-mpi-devel-2-0-0 nec-mpi-libs-2-0-0 nec-mpi-utils-2-0-0
On some systems we had to make sure some RPMs were installed (the dependencies may be incomplete):
yum install libgcc-ve-static libsysve-devel veos-devel
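To see whether any of these packages is actually missing before or after the install, you can query the RPM database. The following is a small sketch that relies on rpm -q returning a non-zero exit status for packages that are not installed; the function name missing_packages and the RPM_QUERY override (useful for trying the function without an RPM database) are assumptions introduced for this example.

```shell
#!/bin/sh
# Hypothetical helper: print every package from the argument list that
# the query command reports as not installed. The query command
# defaults to "rpm -q" and can be overridden through RPM_QUERY.
missing_packages() {
    for pkg in "$@"; do
        ${RPM_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
}
```

For instance, missing_packages libgcc-ve-static libsysve-devel veos-devel prints nothing when all three packages are present.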
Reboot the system and wait for the nodes to come online.
Install MOFED, if needed
If you were using Infiniband before and had to uninstall Mellanox OFED, you must now install the appropriate version. From http://www.mellanox.com/page/products_dyn?product_family=26 download “Mellanox OFED 4.3-3.0.2.1” (the tarball, not the ISO).
Untar the tarball, enter the directory and type:
./mlnxofedinstall --add-kernel-support --kmp
Install the ve_peermem package that manages memory mappings between IB cards and VEs.
yum install ve_peermem
Update VMC Firmware
If the vmcfw package was updated, it is recommended to update the VMC firmware of the VE card as well. The VE’s FW version can be checked with the command:
/opt/nec/ve/bin/vecmd info | grep VMCFW
If the version differs from the version of the vmcfw RPM, the VE’s FW should be updated while the cards are in maintenance mode and the VEOS related services are still stopped. The procedure is simple, but make sure the updating command is issued from within the directory containing the firmware:
cd /etc/opt/nec/ve/mmm/vmc
/opt/nec/ve/bin/vecmd state set off
/opt/nec/ve/bin/vecmd state set mnt
systemctl stop ve-os-state-monitor@*.service ve-os-launcher@*.service ve-ived
/opt/nec/ve/bin/vecmd fwup vmc aurora_MK10.bin
Reboot the VH after a FW upgrade.
Bring VE cards back online
If Infiniband for SX-Aurora TSUBASA is installed on the system:
systemctl start vemmd
Then load the VE driver kernel module and start MMM:
modprobe ve_drv
systemctl start mmm
Or reboot the system.