2024.09.06 更新,有加速效果
安装环境
deepin 社区版(rc2)(23),lammps-17Apr2024
首先安装kokkos
用spack安装比较简单,
git clone -c feature.manyFiles=true https://github.com/spack/spack.git spack
cd spack/bin/
./spack install kokkos
三行命令搞定。
编译lammps
cd ./lammps-17Apr2024/src
make clean-all
make yes-kokkos
make peachrl_kokkos -j 8
其中MAKE/Makefile.peachrl_kokkos 如下:
# ubuntu = Ubuntu Linux box, g++, openmpi, FFTW3
# you have to install the packages g++, mpi-default-bin, mpi-default-dev,
# libfftw3-dev, libjpeg-dev and libpng12-dev to compile LAMMPS with this
# makefile
SHELL = /bin/sh
# ---------------------------------------------------------------------
# compiler/linker settings
# specify flags and libraries needed for your compiler
CC = mpicxx
CCFLAGS = -g -O3
SHFLAGS = -fPIC
DEPFLAGS = -M
LINK = mpicxx
LINKFLAGS = -g -O3
LIB =
SIZE = size
ARCHIVE = ar
ARFLAGS = -rc
SHLIBFLAGS = -shared
KOKKOS_DEVICES = OpenMP
# ---------------------------------------------------------------------
# LAMMPS-specific settings, all OPTIONAL
# specify settings for LAMMPS features you will use
# if you change any -D setting, do full re-compile after "make clean"
# LAMMPS ifdef settings
# see possible settings in Section 3.5 of the manual
LMP_INC = -DLAMMPS_GZIP -DLAMMPS_JPEG -DLAMMPS_PNG -DLAMMPS_FFMPEG
# MPI library
# see discussion in Section 3.4 of the manual
# MPI wrapper compiler/linker can provide this info
# can point to dummy MPI library in src/STUBS as in Makefile.serial
# use -D MPICH and OMPI settings in INC to avoid C++ lib conflicts
# INC = path for mpi.h, MPI compiler settings
# PATH = path for MPI library
# LIB = name of MPI library
MPI_INC = -DMPICH_SKIP_MPICXX -DOMPI_SKIP_MPICXX=1
MPI_PATH =
MPI_LIB =
# FFT library
# see discussion in Section 3.5.2 of manual
# can be left blank to use provided KISS FFT library
# INC = -DFFT setting, e.g. -DFFT_FFTW, FFT compiler settings
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW3
FFT_PATH =
FFT_LIB = -lfftw3
# JPEG and/or PNG library
# see discussion in Section 3.5.4 of manual
# only needed if -DLAMMPS_JPEG or -DLAMMPS_PNG listed with LMP_INC
# INC = path(s) for jpeglib.h and/or png.h
# PATH = path(s) for JPEG library and/or PNG library
# LIB = name(s) of JPEG library and/or PNG library
JPG_INC =
JPG_PATH =
JPG_LIB = -ljpeg -lpng
# library for loading shared objects (defaults to -ldl, should be empty on Windows)
# uncomment to change the default
# override DYN_LIB =
# ---------------------------------------------------------------------
# build rules and dependencies
# do not edit this section
include Makefile.package.settings
include Makefile.package
EXTRA_INC = $(LMP_INC) $(PKG_INC) $(MPI_INC) $(FFT_INC) $(JPG_INC) $(PKG_SYSINC)
EXTRA_PATH = $(PKG_PATH) $(MPI_PATH) $(FFT_PATH) $(JPG_PATH) $(PKG_SYSPATH)
EXTRA_LIB = $(PKG_LIB) $(MPI_LIB) $(FFT_LIB) $(JPG_LIB) $(PKG_SYSLIB) $(DYN_LIB)
EXTRA_CPP_DEPENDS = $(PKG_CPP_DEPENDS)
EXTRA_LINK_DEPENDS = $(PKG_LINK_DEPENDS)
# Path to src files
vpath %.cpp ..
vpath %.h ..
# Link target
$(EXE): main.o $(LMPLIB) $(EXTRA_LINK_DEPENDS)
$(LINK) $(LINKFLAGS) main.o $(EXTRA_PATH) $(LMPLINK) $(EXTRA_LIB) $(LIB) -o $@
$(SIZE) $@
# Library targets
$(ARLIB): $(OBJ) $(EXTRA_LINK_DEPENDS)
@rm -f ../$(ARLIB)
$(ARCHIVE) $(ARFLAGS) ../$(ARLIB) $(OBJ)
@rm -f $(ARLIB)
@ln -s ../$(ARLIB) $(ARLIB)
$(SHLIB): $(OBJ) $(EXTRA_LINK_DEPENDS)
$(CC) $(CCFLAGS) $(SHFLAGS) $(SHLIBFLAGS) $(EXTRA_PATH) -o ../$(SHLIB) \
$(OBJ) $(EXTRA_LIB) $(LIB)
@rm -f $(SHLIB)
@ln -s ../$(SHLIB) $(SHLIB)
# Compilation rules
%.o:%.cpp
$(CC) $(CCFLAGS) $(SHFLAGS) $(EXTRA_INC) -c $<
# Individual dependencies
depend : fastdep.exe $(SRC)
@./fastdep.exe $(EXTRA_INC) -- $^ > .depend || exit 1
fastdep.exe: ../DEPEND/fastdep.c
cc -O -o $@ $<
sinclude .depend
命令
只需要在开始计算时在命令中加入-sf kk
,相当于自动在能够加速的地方加上了“/kk”,不需要修改脚本。
mpirun -np 8 lmp4kk -k on t 1 -sf kk -in in.script
不过,好像对我的计算根本没有加速效果诶。。?
2024.09.06 更新 解决warning【OMP_PROC_BIND】
注意到一个warning:
Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
For unit testing set OMP_PROC_BIND=false
手册里有说明,于是按照手册另一处的说明调整了一下Makefile(这种是不适用于多线程的,只能t=1):
# ubuntu = Ubuntu Linux box, g++, openmpi, FFTW3
# you have to install the packages g++, mpi-default-bin, mpi-default-dev,
# libfftw3-dev, libjpeg-dev and libpng12-dev to compile LAMMPS with this
# makefile
SHELL = /bin/sh
# ---------------------------------------------------------------------
# compiler/linker settings
# specify flags and libraries needed for your compiler
CC = mpicxx
CCFLAGS = -g -O3
SHFLAGS = -fPIC
DEPFLAGS = -M
LINK = mpicxx
LINKFLAGS = -g -O3
LIB =
SIZE = size
ARCHIVE = ar
ARFLAGS = -rc
SHLIBFLAGS = -shared
KOKKOS_DEVICES = Pthread
KOKKOS_USE_TPLS=hwloc
# or KOKKOS_USE_TPLS=libnuma
# ---------------------------------------------------------------------
# LAMMPS-specific settings, all OPTIONAL
# specify settings for LAMMPS features you will use
# if you change any -D setting, do full re-compile after "make clean"
# LAMMPS ifdef settings
# see possible settings in Section 3.5 of the manual
LMP_INC = -DLAMMPS_GZIP -DLAMMPS_JPEG -DLAMMPS_PNG -DLAMMPS_FFMPEG
# MPI library
# see discussion in Section 3.4 of the manual
# MPI wrapper compiler/linker can provide this info
# can point to dummy MPI library in src/STUBS as in Makefile.serial
# use -D MPICH and OMPI settings in INC to avoid C++ lib conflicts
# INC = path for mpi.h, MPI compiler settings
# PATH = path for MPI library
# LIB = name of MPI library
MPI_INC = -DMPICH_SKIP_MPICXX -DOMPI_SKIP_MPICXX=1
MPI_PATH =
MPI_LIB =
# FFT library
# see discussion in Section 3.5.2 of manual
# can be left blank to use provided KISS FFT library
# INC = -DFFT setting, e.g. -DFFT_FFTW, FFT compiler settings
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW3
FFT_PATH =
FFT_LIB = -lfftw3
# JPEG and/or PNG library
# see discussion in Section 3.5.4 of manual
# only needed if -DLAMMPS_JPEG or -DLAMMPS_PNG listed with LMP_INC
# INC = path(s) for jpeglib.h and/or png.h
# PATH = path(s) for JPEG library and/or PNG library
# LIB = name(s) of JPEG library and/or PNG library
JPG_INC =
JPG_PATH =
JPG_LIB = -ljpeg -lpng
# library for loading shared objects (defaults to -ldl, should be empty on Windows)
# uncomment to change the default
# override DYN_LIB =
# ---------------------------------------------------------------------
# build rules and dependencies
# do not edit this section
include Makefile.package.settings
include Makefile.package
EXTRA_INC = $(LMP_INC) $(PKG_INC) $(MPI_INC) $(FFT_INC) $(JPG_INC) $(PKG_SYSINC)
EXTRA_PATH = $(PKG_PATH) $(MPI_PATH) $(FFT_PATH) $(JPG_PATH) $(PKG_SYSPATH)
EXTRA_LIB = $(PKG_LIB) $(MPI_LIB) $(FFT_LIB) $(JPG_LIB) $(PKG_SYSLIB) $(DYN_LIB)
EXTRA_CPP_DEPENDS = $(PKG_CPP_DEPENDS)
EXTRA_LINK_DEPENDS = $(PKG_LINK_DEPENDS)
# Path to src files
vpath %.cpp ..
vpath %.h ..
# Link target
$(EXE): main.o $(LMPLIB) $(EXTRA_LINK_DEPENDS)
$(LINK) $(LINKFLAGS) main.o $(EXTRA_PATH) $(LMPLINK) $(EXTRA_LIB) $(LIB) -o $@
$(SIZE) $@
# Library targets
$(ARLIB): $(OBJ) $(EXTRA_LINK_DEPENDS)
@rm -f ../$(ARLIB)
$(ARCHIVE) $(ARFLAGS) ../$(ARLIB) $(OBJ)
@rm -f $(ARLIB)
@ln -s ../$(ARLIB) $(ARLIB)
$(SHLIB): $(OBJ) $(EXTRA_LINK_DEPENDS)
$(CC) $(CCFLAGS) $(SHFLAGS) $(SHLIBFLAGS) $(EXTRA_PATH) -o ../$(SHLIB) \
$(OBJ) $(EXTRA_LIB) $(LIB)
@rm -f $(SHLIB)
@ln -s ../$(SHLIB) $(SHLIB)
# Compilation rules
%.o:%.cpp
$(CC) $(CCFLAGS) $(SHFLAGS) $(EXTRA_INC) -c $<
# Individual dependencies
depend : fastdep.exe $(SRC)
@./fastdep.exe $(EXTRA_INC) -- $^ > .depend || exit 1
fastdep.exe: ../DEPEND/fastdep.c
cc -O -o $@ $<
sinclude .depend
这样编译的话,似乎是有加速的?对于19600个原子的聚酰亚胺壁面在300K弛豫一段时间:
mpirun -np 8 lmp4kkh -k on t 1 -sf kk -in in.script | mpirun -np 8 lmp4kkh -in in.script |
---|---|
"Performance: 0.082 ns/day, 291.026 hours/ns, 3.818 timesteps/s, 74.831 katom-step/s, 98.8% CPU use with 8 MPI tasks x 1 OpenMP threads" | "Performance: 0.062 ns/day, 389.210 hours/ns, 2.855 timesteps/s, 55.954 katom-step/s, 99.7% CPU use with 8 MPI tasks x no OpenMP threads" |