-
Notifications
You must be signed in to change notification settings - Fork 344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: craete a script to ease the process of i18n by make i18n
#1991
Conversation
Thanks for the pull request, @ghassanmas! I've created OSPR-6206 to keep track of it in JIRA, where we prioritize reviews. Please note that it may take us up to several weeks or months to complete a review and merge your PR. Feel free to add as much of the following information to the ticket as you can:
All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here. Please let us know once your PR is ready for our review and all tests are green. |
@feanil following our discussing in the last translation working group, I have created this PR |
@ghassanmas Thank you for your contribution. @nedbat @feanil Would either of you would like to review? |
21f70fe
to
94edab0
Compare
The latest commits this 85bf3e9 and this 94edab0, are made to automate the process of pushing resoucres to transfix , as was propsed by @ehuthmacher (Eden Huthmacher ) during our call today 7 Dec. Note: In order to run, it's important that somone with access to github env, to add tx token (which has permission to push resouces).
|
Hi @ghassanmas - I spoke with @feanil and we have some major docs work coming up, and are currently not that versed in the structure of the docs as-is. We need to pull apart edX-specific docs from Open edX-specific docs, for example. The repo may change significantly. As such I'm not exactly sure what to do with this PR, as it seems potentially premature - and it wouldn't be great to work on translating docs just to see them change out from under you. What are your thoughts? |
@sarina, if I got you right there are two things that can be changed from these docs:
If you are referring to changes of B.i then thus already be taken care of as what @feanil suggested, hence: https://github.com/ghassanmas/edx-documentation/blob/94edab0c17f5cf8e8945041d80c703524e57595a/i18n.sh#L38-L42 . But may still not clear about the content? If you are referring to changes of B.ii then yeah I would totally agree it doesn't make sense to ask translators to translate and then after a short period of time the whole content or large part of the content is expected to be changed. If you are referring to changes of A then it would be okay to rewrite the i18n.sh (Since I don't expect changes of code would affect changes of content). I would just optimize for giving the opportunity for community members to have translatable docs ASAP instead of them having to wait until U2 and open edx decided what content to be changed...etc But of course, you might have more input about how long this process is gonna take, Also regarding the structure, I would say it would maybe better to have just one sphinx project instead of having multiple projects, it would just make the process of generating i18n strings simpler and most importantly less prone to errors hence:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ghassanmas I didn't see the limiting to only 3 books under here. Thanks for pointing that out. I gave you an initial review but have a few questions because I'm not very familiar with our translations infrastructure.
Also, to confirm, the TX_TOKEN
is not currently defined, is that right? Can we make it so that translations sync won't run if that setting is missing?
.github/workflows/build.yml
Outdated
TX_TOKEN: ${{ secrets.TX_TOKEN }} | ||
run: | | ||
./i18n.sh | ||
tx push -s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add trailing newline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's totally doable I would just add that to run when TX_TOKEN
is missing and the running env is github, because if run locally there would be different way for transfiex to check for authtincaiion.
pyjwt[crypto]==1.7.1 | ||
# elasticsearch>=7.14.0 includes breaking changes in it which caused issues in discovery upgrade process. | ||
# elastic search changelog: https://www.elastic.co/guide/en/enterprise-search/master/release-notes-7.14.0.html | ||
elasticsearch<7.14.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this in the constraints file? Doesn't seem relevant to edx-documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was generated by make upgrade
because I added new dependences
.tx/config
Outdated
[main] | ||
host = https://www.transifex.com | ||
|
||
[open-edx-documentation-project.OpenSFD_certificates] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with how this configuration is setup, can you provide some context?
- What does
OpenSFD
stand for? - Why are there so many resource entries in this file? Is it one per RST file?
- Was this generated in some way? If so, what tool or process was used to generate it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does OpenSFD stand for?
It's this rst file https://github.com/edx/edx-documentation/blob/master/en_us/open_edx_students/source/OpenSFD_certificates.rst to be more accuarte it's OpenSFD_cerficates
Why are there so many resource entries in this file? Is it one per RST file?
Yes there is one resoucre per rst file from each project and also from the shared proejct
Was this generated in some way? If so, what tool or process was used to generate it?
That exactly what does i18n.sh
does, it suses sphinx-intl
tool to generate those file for each project (the open edx projects).
I tried to dcoument each step in the file. I used sphinx-intl
because the sphinx
is what is used here to handle doc also as suggested by @jmbowman in this trello ticket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that file is generated, does it make sense to check it in? Wouldn't it be better to generate it when we need it? Otherwise, we'll need to ensure that anyone that adds a new rst file will also need to know to re-generate the config file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it make sense to remove it given also the now github workflow will generate it anyway...
2482d37
to
c87bd5e
Compare
@feanil I have pushed a commit to reolsve your suggestion, let me know if there anything I need to do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking that we're closing enough to moving these repos over, that we may want to hold merging this until it's moved over to the openedx
org but I think it's starting to make sense to me as a starting point and it's pretty close to being ready to merge from my perspective.
.github/workflows/build.yml
Outdated
|
||
- name : "Translations Sync" | ||
env: | ||
TX_TOKEN: ${{ secrets.TX_TOKEN }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make it clear which token this is, rename it to TRANSIFEX_TOKEN
Can you also add docs for what permissions the token user needs to successfully run this script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean rename it in the secret or in the env
. Because I am afraid for env
it has to be TX_TOKEN
as per transifex doc. Also tx
is cli name for transifex, so it should be okay to assume future developer be famiiliar with tx
.
Yes I can add doc about what the permission of token needed to be, I will refer to this doc, simply it's as per tx doc the one or the user generating the token should have permission of project maintaer or administoraor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's great, linking to the doc and adding the one line that repeats the critical info is a good improvement. That way if the doc moves we still have the info and if the doc does not move, we can always get more details if needed.
I think it's fine to leave it as TX_TOKEN, but we'll need to add this as an org secret and before this job will succeed.
.github/workflows/build.yml
Outdated
|
||
- name : "Translations Sync" | ||
env: | ||
TX_TOKEN: ${{ secrets.TX_TOKEN }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's great, linking to the doc and adding the one line that repeats the critical info is a good improvement. That way if the doc moves we still have the info and if the doc does not move, we can always get more details if needed.
I think it's fine to leave it as TX_TOKEN, but we'll need to add this as an org secret and before this job will succeed.
.github/workflows/build.yml
Outdated
@@ -38,3 +38,11 @@ jobs: | |||
- name: "Build docs" | |||
run: | | |||
./run_tests.sh | |||
|
|||
- name : "Translations Sync" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking more at this workflow, currently this workflow just builds the docs as a part of the testing so this build.yml is a part of CI, what we're doing is adding a release step. If we leave it as is, it will also push to transifex anytime a PR is made which is not what we want. We should only push on merges to master.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just added a condition on the translation step, which should stop it from running unless it's a merging request, hence the build succeed because its a PR and the tx token no longer is releavnt! and Merry Christmas 🎅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ghassanmas I don't think that's quite right either, because then it will run on any PR merging to any branch won't it?
I think the way I'd like to do this is to have just a separate github action for building and testing from the action to push translation strings to transifex. Can you create a separate github action(push_translation_strings.yml
) which runs on push
to master
?
c36b9fc
to
c42a4a9
Compare
@feanil I have just wrapped up the commit log into one. And the building tests behaved as expected with the openedx migration (translation script have been skipped). |
.github/workflows/build.yml
Outdated
@@ -38,3 +38,11 @@ jobs: | |||
- name: "Build docs" | |||
run: | | |||
./run_tests.sh | |||
|
|||
- name : "Translations Sync" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ghassanmas I don't think that's quite right either, because then it will run on any PR merging to any branch won't it?
I think the way I'd like to do this is to have just a separate github action for building and testing from the action to push translation strings to transifex. Can you create a separate github action(push_translation_strings.yml
) which runs on push
to master
?
@feanil It will run on push to master or open release./master as specified here edx-documentation/.github/workflows/build.yml Lines 7 to 9 in 52742f7
I can do that by creating a separate workflow file..etc for it anyway (for separation of concern), but anyway, do we want it to push translation each time push to master happens? because I sensed from @sarina that we better push on each release? |
@feanil not sure if you were there in standup, but yes, each push to master has the potential to wipe a whole page of translations. so we'd want this to somehow push only when a named release branch is made (either as part of release tooling or automatically when a tag is created - however because creating a new named release branch would require a new transifex project, we'd likely need to do it manually) |
Right, yes, thank you for the reminder, it slipped my mind. We should only do it on pushes to the release branches but I think it still makes sense to have it as a separate action. |
So I should create a github action than be triggerd manually i.e. |
No it should run on merges(push) to the open.release/*.master branches.
…On Tue, Jan 11, 2022 at 12:18 PM Ghassan Maslamani ***@***.***> wrote:
So I should create a github action than be triggerd manually i.e. on:
workflow_dispatch ref:
https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow#configuring-a-workflow-to-run-manually
—
Reply to this email directly, view it on GitHub
<#1991 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAF6Z6NZCCY4KMZC7FFTZRTUVRQ7RANCNFSM5HLN35UQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
- The script runs on new releases, will generate .tx directory. - .tx contains transifex config files of *specific* edx-doc projects, the files will be used by tx-client in order to push resources. - Both sphinx-intl and tx-client libraries are being used and thus are added to the dependency. - TX_TOKEN is expected to be available in the env with push permission, for more info refer to: https://docs.transifex.com/client/client-configuration - Related ticket: https://trello.com/c/q4JJOHeo
c42a4a9
to
50443a9
Compare
@feanil done |
How do you make sure it automatically goes to the correct project? I envision that the lilac.master branch creates a Lilac project in transifex, the maple.master branch a Maple project, etc (my Transifex terms may be wrong - I don't actually know how to create separate named branches within a Transifex repo/project). |
@sarina the i18n.sh will always push to the same project in tx which is open-edx-documentation-project ref The thing is if we pushed to a new project each time a new release happens then isn't the translation process needed to be started from scratch all over again? Regarding what happens when a file is a resource file being pushed again or updated, its that TX will work out which strings have been changed in that file and a new string for translation accordingly ref The thing that hasn't been thought of is when and how should the translation for each release will be pulled, should this repo try for example through a GitHub action to pull the translation from tx on e.g. weekly basis? |
@ghassanmas this is exactly what I've been saying needs some serious investigation/thought. Right now, people are running on instances 1 or 2 or even 6 named releases behind the current named release. We need a way of maintaining translations for previous named releases, not just the most current. |
If it's critical to keep maintaining translation for multiple releases then to my knowlege of tx we have two ways to resolve this
I think both are doable, however, if we choose (1) and do it the normal way, then translators might be frustrated because it would require to rereview or retranslate the whole project again. But we could do it in a handy way such that when instead of creating an empty project (for the new release) we could copy or clone the translation of the last release and then push the new resource files, TX mentioned this here and even refer to a python script that can be utilized for that. For (2) I am not sure about that for now, because it would affect the translation experience like how many resources will end be in the project? Also, it might be complicated if for example for a new release new rst files are being added |
You might go to the Transifex working group and ask the translators, but I
_very strongly_ suspect they'll want translated docs for a specific named
release to be available - and available to be updated - for a few years. So
I think we should get ahead of that problem. And figure out how to preserve
translation memory.
Right now, I am able to spell out requirements but not investigate the
solution space. It might be good to see if there's anyone else in the
community with Transifex knowledge to help you figure out the best course
of action.
…On Thu, Jan 13, 2022 at 2:30 PM Ghassan Maslamani ***@***.***> wrote:
If it's critical to keep maintaining translation for multiple releases
then to my knowlege of tx we have two ways to resolve this
1. Have a new project for each release i.e. "Open edx documentation
project "
2. Have a new translation file for each release i.e. "readme..po"
I think both are doable, however, if we choose (1) and do it the normal
way, then translators might be frustrated because it would require to
rereview or retranslate the whole project again. But we could do it in a
handy way such that when instead of creating an empty project (for the new
release) we could copy or clone the translation of the last release and
then push the new resource files, TX mentioned this here
<https://docs.transifex.com/projects/moving-content> and even refer to a
python script that can be utilized for that.
For (2) I am not sure about that for now, because it would affect the
translation experience like how many resources will end be in the project?
Also, it might be complicated if for example for a new release new rst
files are being added
—
Reply to this email directly, view it on GitHub
<#1991 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAPEWJK3GT6742RZH5VVYVLUV4R45ANCNFSM5HLN35UQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you were mentioned.Message ID:
***@***.***>
--
*Sarina Canelake* | Pronunciation <https://namedrop.io/sarinacanelake>
Engineering Manager
The Center For Reimagining Learning
Pronouns: She/Her/Hers
141 Portland Street, 9th Floor, Cambridge, MA 02139
|
Can you outline the requirements explicitly again? Is it just to save the transaltion per release? |
@ghassanmas based on the conversations here and in a few other places, I wrote up #2018 to clearly define what we want done around translations. While I think the tooling discovery you've done in this PR has been really useful, I think we need to have a much better understanding about how we're going to manage the translations process around documentation before we merge this and create a bunch of translation work that we may have to throw away. Thanks for this first attempt at this, it really helps clarify what questions we need answered before we can proceed with having a good documentation translation process but I don't think we should merge this as is. Let's talk more about answering the questions raised from #2018 before we land the tooling to start pushing things up to transifex. |
@ghassanmas Even though your pull request wasn’t merged, please take a moment to answer a two question survey so we can improve your experience in the future. |
The
make i18n
script once run, will generate both pot files andtransifex config file, to east the process of translating
the specific edx-doc projects. before running the script, you need to install
sphinx-intl via
pip install sphinx-intl
Related ticket: https://trello.com/c/q4JJOHeo
Please note:
This might generate duplicate strings, because the three projects "open_edx_students", "open_edx_course_authors", and "en_us/open_edx_release_notes" I assume would share strings or resources from "en_us/shared", so it would be important to figure out if there would be a way to not have duplicate or at least to let translators know about it!