日文分词 Mecab 和 PHP7/PHP8 的 extension 扩展

日文分词

Install

# 分词组件
cd /opt
wget https://distfiles.macports.org/mecab/mecab-0.996.tar.gz
wget https://distfiles.macports.org/mecab/mecab-ipadic-2.7.0-20070801.tar.gz
tar xvzf mecab-0.996.tar.gz
cd mecab-0.996
./configure --enable-utf8-only --enable-mutex 
### エラーが出る場合は ./configure --with-charset=utf8 --enable-utf8-only のオプションに変更
make && make install

tar xvzf mecab-ipadic-2.7.0-20070801.tar.gz
cd mecab-ipadic-2.7.0-20070801
./configure --with-charset=utf8
make && make install

# PHP扩展
## PHP 7.x
git clone https://github.com/rsky/php-mecab 
cd php-mecab/mecab 
phpize
./configure --with-mecab=/usr/local/bin/mecab-config 
make && make install

## PHP 8.x
git clone https://github.com/ranvis/php-mecab.git
cd php-mecab/mecab 
phpize
./configure --with-mecab=/usr/local/bin/mecab-config 
make && make install

# vi /etc/php.ini
extension=mecab.so

Using

$str = "すもももももももものうち";
$t = new \MeCab\Tagger();
$m = explode("\n", $t->parse($str));

常见问题

make
/bin/sh /usr/local/src/php-mecab/mecab/libtool --mode=link cc -DPHP_ATOM_INC -I/usr/local/src/php-mecab/mecab/include -I/usr/local/src/php-mecab/mecab/main -I/usr/local/src/php-mecab/mecab -I/usr/include/php -I/usr/include/php/main -I/usr/include/php/TSRM -I/usr/include/php/Zend -I/usr/include/php/ext -I/usr/include/php/ext/date/lib -I/usr/local/include  -DHAVE_CONFIG_H  -g -O2  -L/usr/lib64 -o mecab.la -export-dynamic -avoid-version -prefer-pic -module -rpath /usr/local/src/php-mecab/mecab/modules  mecab7.lo -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lmecab -lstdc++
libtool: link: cc -shared  .libs/mecab7.o   -Wl,-rpath -Wl,/usr/local/lib -Wl,-rpath -Wl,/usr/local/lib/../lib -Wl,-rpath -Wl,/usr/local/lib -Wl,-rpath -Wl,/usr/local/lib/../lib -L/usr/lib64 -L/usr/local/lib /usr/local/lib/libmecab.so -lpthread /usr/local/lib/../lib/libstdc++.so -lm  -Wl,-rpath -Wl,/usr/local/lib   -Wl,-soname -Wl,mecab.so -o .libs/mecab.so
/usr/local/lib/../lib/libstdc++.so: could not read symbols: File in wrong format
collect2: ld returned 1 exit status
configure のオプションを指定

# LDFLAGS=-L/usr/local/lib64 ./configure
# make
# make install
/etc/ld.so.conf 見てくれない時などは上記のオプションが便利
インストール後に mecab.so ができていることを確認する

# make install
Installing shared extensions:     /usr/lib64/php/modules/
# ls /usr/lib64/php/modules/
bz2.so       exif.so      gmp.so       mecab.so    pdo_mysql.so   simplexml.so  sysvsem.so    xdebug.so
calendar.so  fileinfo.so  iconv.so     mysqli.so   pdo_sqlite.so  soap.so       sysvshm.so    xml.so
ctype.so     ftp.so       json.so      mysqlnd.so  phar.so        sockets.so    tidy.so       xmlreader.so
curl.so      gd.so        mbstring.so  opcache.so  posix.so       sqlite3.so    tokenizer.so  xmlwriter.so
dom.so       gettext.so   mcrypt.so    pdo.so      shmop.so       sysvmsg.so    wddx.so  

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注