Extension:Wikispeech/Installing Speechoid


Speechoid is the text-to-speech backend of Wikispeech. It consists of a number of services controlled via Wikispeech-server. The Speechoid build pipeline uses Blubber to create Docker images, which are meant to be deployed on, for instance, a Kubernetes cluster. There is also a Compose project for easy deployment on development servers and small installations.

Please report any errors you encounter on the discussion page of this wiki page.

Prebuilt images

All the Speechoid services are a part of the Wikimedia CI pipeline, and all builds are available in the Wikimedia docker registry. You probably don't need to build your own images.

Building images

Prerequisites

This guide is based on the setup used by the Wikispeech development team, which uses Ubuntu (18, 19 and 20) on workstations and Debian (10) on servers.

  • Docker
  • Blubber - We strongly recommend using the prebuilt binary. Make sure it's available in your $PATH, e.g. by copying it to /usr/local/bin.
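A quick way to confirm both prerequisites are reachable before building is to look them up on `$PATH`. The following is a minimal sketch, not part of the Speechoid tooling; the function name `missing_tools` is hypothetical, and the sketch assumes the Blubber binary was installed under the name `blubber`.

```shell
#!/bin/sh
# Print the name of every listed command that cannot be found on PATH.
# (missing_tools is a hypothetical helper, not part of Speechoid.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the two prerequisites named above; "blubber" is the assumed binary name.
missing=$(missing_tools docker blubber)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing from PATH:" $missing >&2
fi
```

If nothing is printed, both commands were found.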

Using the Compose script

The docker-compose project comes with a script that will download and build all services for you.

git clone https://github.com/Wikimedia-Sverige/wikispeech-speechoid-docker-compose.git
cd wikispeech-speechoid-docker-compose
./create-all-images.sh

Manually building images

Each service lives in its own git repository at gerrit.wikimedia.org:

git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/mary-tts"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/mishkal"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/pronlex"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/symbolset"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/wikispeech-server"

Each service contains a Blubber helper script that prepares the image.

cd some-service
./blubber-build.sh
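The clone and build steps above can be repeated for all five services in a loop. This is only a sketch: the function names `clone_url` and `build_all` and the `DRY_RUN` variable are hypothetical, and it assumes every repository has `blubber-build.sh` at its top level, as described above.

```shell
#!/bin/sh
# Base URL and service names are taken from the clone commands above.
GERRIT_BASE="https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech"
SERVICES="mary-tts mishkal pronlex symbolset wikispeech-server"

# Print the clone URL for one service (hypothetical helper).
clone_url() {
    printf '%s/%s\n' "$GERRIT_BASE" "$1"
}

# Clone (if not already present) and build every service.
# With DRY_RUN=1 the commands are only printed, not executed.
build_all() {
    for service in $SERVICES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "git clone $(clone_url "$service")"
            echo "(cd $service && ./blubber-build.sh)"
        else
            [ -d "$service" ] || git clone "$(clone_url "$service")" || return 1
            (cd "$service" && ./blubber-build.sh) || return 1
        fi
    done
}
```

Running `DRY_RUN=1 build_all` lists what would be done; `build_all` on its own performs the clones and builds and stops at the first failure.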

Starting Speechoid

Using Compose

The easiest way is to spin it up using Docker Compose:

git clone https://github.com/Wikimedia-Sverige/wikispeech-speechoid-docker-compose
cd wikispeech-speechoid-docker-compose
./run.sh

If you built your own images, you'll need to update the compose file so that it matches your image build tags.

After a little while Wikispeech server should be available as an HTTP service on port 10000.
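Since the services take a while to come up, a small polling loop can tell you when the port starts answering. The sketch below is an illustration only: `retry_until` is a hypothetical helper, and the usage line assumes `curl` is installed and that any HTTP response on port 10000 means the server is up.

```shell
#!/bin/sh
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# (retry_until is a hypothetical helper, not part of Speechoid.)
retry_until() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Output of the probe command is discarded; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep only if another attempt will follow.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}
```

For example, `retry_until 30 2 curl -s -o /dev/null http://localhost:10000/ && echo "Wikispeech-server is answering"` probes every two seconds for up to about a minute.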

Manually starting Speechoid

At the time of writing, there is no ready-made pipeline for starting Speechoid in other environments. When ready for production, Speechoid will be deployed on a Kubernetes cluster.

Setting up an instance on Cloud VPS

These instructions are for Debian 10 on WMF cloud VPS services, but should be rather generic and reusable for most environments.

Clone and prepare Speechoid docker-compose

Install Docker Compose following the instructions on https://docs.docker.com/compose/install/.

sudo su
mkdir /etc/docker-compose
cd /etc/docker-compose
git clone https://github.com/Wikimedia-Sverige/wikispeech-speechoid-docker-compose.git speechoid
cd speechoid
mkdir -p volumes/wikispeech_mockup_tmp
chmod a+rwx volumes/wikispeech_mockup_tmp

Set up as a system service

The following will automatically start Speechoid on boot.

# sudo su
# cat << EOF > /etc/systemd/system/docker-compose\@.service
[Unit]
Description=docker-compose %i service
Requires=docker.service network-online.target
After=docker.service network-online.target

[Service]
WorkingDirectory=/etc/docker-compose/%i
Type=simple
TimeoutStartSec=15min
Restart=always
ExecStart=/usr/bin/docker compose up
ExecStop=/usr/bin/docker compose down

[Install]
WantedBy=multi-user.target
EOF

# systemctl enable docker-compose@speechoid
# systemctl start docker-compose@speechoid
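The unit file above is a systemd template: the text after `@` in the unit name replaces `%i`, which the template uses as the directory name under /etc/docker-compose. The sketch below makes that mapping explicit; `unit_for_project` is a hypothetical helper and assumes a simple directory name without characters that systemd would need to escape.

```shell
#!/bin/sh
# Print the systemd unit name that corresponds to a compose project directory.
# The last path component becomes the instance name (the %i in the template).
# (unit_for_project is a hypothetical helper; simple directory names assumed.)
unit_for_project() {
    printf 'docker-compose@%s.service\n' "$(basename "$1")"
}
```

With this, `systemctl status "$(unit_for_project /etc/docker-compose/speechoid)"` shows whether the service started, and `journalctl -u "$(unit_for_project /etc/docker-compose/speechoid)"` shows its log output.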

Set up web proxy

Now you need to set up an HTTP proxy pass to Speechoid on port 10000. Alternatively, you could set up a floating IP, but that isn't covered by this guide.

Go to https://horizon.wikimedia.org/project/proxy/ and click the Create Proxy button. Fill in the information about your VPS and use backend port 10000. Speechoid should now be publicly available over HTTP and HTTPS on ports 80 and 443.

To enable transcription preview on Special:EditLexicon, the Symbolset server also needs to be accessible. This is done as described above, but with port 8771.

Pronlex on MariaDB

Out of the box, Pronlex runs on a built-in SQLite database. MariaDB/MySQL is also supported.

If you don't have MariaDB installed:

sudo apt -y update
sudo apt -y install software-properties-common gnupg2
sudo apt -y upgrade
sudo reboot

Set up your favorite mirror. (I used one in Kenya for no particular reason. You might want something more local or more trusted.)

sudo apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 0xF1656F24C74CD1D8
sudo add-apt-repository 'deb [arch=amd64] http://mariadb.mirror.liquidtelecom.com/repo/10.4/debian buster main'
sudo apt update

Install MariaDB.

sudo apt install mariadb-server mariadb-client

Secure the installation, but do not set a root password. Pronlex expects this (?! really ?!) and will create a user and databases further down.

sudo mysql_secure_installation

Install and set up Go.

sudo apt install build-essential gcc
cd ~
mkdir opt
wget https://dl.google.com/go/go1.13.linux-amd64.tar.gz -O /tmp/go1.13.linux-amd64.tar.gz
cd opt
tar xvfz /tmp/go1.13.linux-amd64.tar.gz
export GOROOT=~/opt/go
export GOPATH=~/opt/goProjects
export PATH=${GOPATH}/bin:${GOROOT}/bin:${PATH}

Install and build Pronlex.

cd ~
git clone https://github.com/wikimedia/mediawiki-services-wikispeech-pronlex.git
cd ~/mediawiki-services-wikispeech-pronlex
go build ./...

Populate MariaDB with Lexdata.

cd ~
git clone https://github.com/stts-se/wikispeech-lexdata.git
/bin/bash scripts/import.sh -a ~/appdir -e mariadb -l 'speechoid:@tcp(127.0.0.1:3306)' -f ~/wikispeech-lexdata

Profit!

(You might want to set a password for the MariaDB user speechoid at this point)

(You might also want to create a remote-access account for the MariaDB user speechoid, with the same password and grants, at this point.)

(You can now delete all GoLang, Pronlex and Lexdata folders if you want.)

Start Pronlex in your Speechoid installation, pointing it at this MariaDB instance. Something like:

/bin/bash scripts/start_server.sh -a /srv/appdir -e mariadb -l 'speechoid:password@tcp(wikispeech-tts-pronlex:3306)'

If you are running from docker-compose, simply uncomment and update the environment variable PRONLEX_MARIADB_URI in the pronlex section of the compose file.
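The connection strings used in this section share the shape `user:password@tcp(host:port)`. The sketch below assembles such a string from its parts; `pronlex_dsn` is a hypothetical helper, the default port 3306 is taken from the examples above, and whether PRONLEX_MARIADB_URI expects exactly this shape is an assumption you should check against the comments in the compose file.

```shell
#!/bin/sh
# Build a connection string of the form user:password@tcp(host:port),
# the same shape as the -l arguments shown above.
# (pronlex_dsn is a hypothetical helper; port defaults to 3306.)
pronlex_dsn() {
    user=$1
    password=$2
    host=$3
    port=${4:-3306}
    printf '%s:%s@tcp(%s:%s)\n' "$user" "$password" "$host" "$port"
}
```

For example, `pronlex_dsn speechoid password wikispeech-tts-pronlex` prints the string used with `start_server.sh` above, and an empty password argument reproduces the form used during the import.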