Add table and model for memes from users #5

Open
wants to merge 4 commits into
base: main

aleksspevak
Contributor

@aleksspevak aleksspevak commented Dec 30, 2023

Explanation of the meme_raw_upload table:

class MemeUserUpload(CustomModel):
    message_id: int
    chat: dict

    content: str | None = None
    date: datetime

    out_links: list[str] | None = None
    mentions: list[str] | None = None # mentioned usernames
    hashtags: list[str] | None = None
    forwarded: dict | None = None
    
    image: list[dict] | None = None  # in the data received so far this is actually a dict, not a list; needs more checking with several images and videos
    video: list[dict] | None = None
    
meme_raw_upload = Table(
    "meme_raw_upload",
    metadata,
    Column("id", Integer, Identity(), primary_key=True),
    Column("message_id", Integer, nullable=False),
    # from message_id
    # type int not null
    # Example 17, 20 ...
    Column("chat", JSONB, nullable=False),
    # from chat
    # type jsonb not null
    # Example Chat(first_name='', id=, type=<ChatType.PRIVATE>, username=''),
    # first_name does not seem very necessary; I would keep the rest

    Column("content", String),
    # from caption
    # type varchar null
    # Example text, text ...
    Column("date", DateTime, nullable=False),
    # from date
    # type datetime not null
    # Example datetime.datetime(2023, 12, 29, 19, 25, 5, tzinfo=<UTC>)

    Column("out_links", JSONB),
    # from caption_entities where type MessageEntityType.TEXT_LINK
    # type jsonb null
    # Example [https://t.me/ffmemesbot?start=sc_267689, https://huggingface.co/spaces/badayvedat/LLaVA]
    Column("mentions", JSONB),
    # No examples seen yet, but it may be needed
    # most likely it will appear in caption_entities under some entity type
    # type jsonb null
    Column("hashtags", JSONB),
    # from caption_entities where type MessageEntityType.HASHTAG get length and offset ->parsing caption
    # type jsonb null
    # Example [#meme]
    Column("forwarded", JSONB),
    # from api_kwargs.forward_origin when resend to bot
    # type jsonb null
    # Examples:
    # {'type': 'channel',
    #  'chat': {
    #      'id': ,
    #      'title': 'Fast Food Memes / ffmemes',
    #      'username': 'fastfoodmemes',
    #      'type': 'channel'
    #  },
    #  'message_id': 8118,
    #  'date': 1703875067}
    # {'type': 'hidden_user', 'sender_user_name': '', 'date': 1703853440}
    # {'type': 'user', 'sender_user': {'id': , 'is_bot': False, 'first_name': ''}, 'date': 1703877450}

    Column("media", JSONB),
    # from photo: I would take the single PhotoSize dict with the largest height + width; there is no example with two images yet
    # type jsonb null
    # Examples:
    # photo=(
    #     PhotoSize(file_id='QADNAQ', file_size=1446, file_unique_id='G00eYUh9', height=90, width=58),
    #     PhotoSize(file_id='QADNAQ', file_size=19393, file_unique_id='G00eYUh9', height=320, width=206),
    #     PhotoSize(file_id='QADNAQ', file_size=72237, file_unique_id='G00eYUh9', height=800, width=516),
    #     PhotoSize(file_id='QADNAQ', file_size=88190, file_unique_id='G00eYUh9-', height=1080, width=696)
    # )
    # from video: there is a lot of data; the essential fields seem to be everything except api_kwargs and thumbnail; in the example with two videos, data for only one of them arrived
    # type jsonb null
    # Examples:
    # video=Video(
    # 	api_kwargs={
    # 		'thumb': {
    # 			'file_id': 'BwEAB20AAzQE',
    # 			'file_unique_id': 'A',
    # 			'file_size': 11829,
    # 			'width': 175,
    # 			'height': 320
    # 		 }
    # 	},
    # 	duration=21,
    # 	file_id='gTXqpPgc0BA',
    # 	file_name='IMG_2990.MP4',
    # 	file_size=2688663,
    # 	file_unique_id='F',
    # 	height=848,
    # 	mime_type='video/mp4',
    # 	thumbnail=PhotoSize(file_id='BwEAB20AAzQE', file_size=11829, file_unique_id='BwEAB20AAzQE', height=320, width=175),
    # 	width=464)
    Column("created_at", DateTime, server_default=func.now(), nullable=False),
    Column("updated_at", DateTime, onupdate=func.now()),
)
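
The comments on `out_links`, `mentions` and `hashtags` describe pulling values out of `caption_entities`. A minimal sketch of that step follows, assuming each entity has already been converted to a plain dict with `type`, `offset`, `length` and (for text links) `url`, and that offsets are counted in UTF-16 code units as in the Telegram Bot API. The helper names `entity_text` and `parse_caption_entities` are hypothetical and not part of this PR; treating `mention` as the entity type for usernames is also an assumption, since no example was available.

```python
from __future__ import annotations

from typing import Any


def entity_text(caption: str, offset: int, length: int) -> str:
    """Slice a caption by UTF-16 code-unit offset and length."""
    raw = caption.encode("utf-16-le")  # 2 bytes per UTF-16 code unit
    return raw[offset * 2:(offset + length) * 2].decode("utf-16-le")


def parse_caption_entities(
    caption: str | None, entities: list[dict[str, Any]]
) -> dict[str, list[str] | None]:
    """Collect hashtags, mentions and text links for the JSONB columns."""
    hashtags: list[str] = []
    mentions: list[str] = []
    out_links: list[str] = []
    for entity in entities:
        kind = entity["type"]
        if kind == "text_link":
            # The URL lives on the entity itself, not in the caption text.
            out_links.append(entity["url"])
        elif caption is not None and kind in ("hashtag", "mention"):
            text = entity_text(caption, entity["offset"], entity["length"])
            (hashtags if kind == "hashtag" else mentions).append(text)
    # Empty lists become None so the nullable columns stay NULL.
    return {
        "hashtags": hashtags or None,
        "mentions": mentions or None,
        "out_links": out_links or None,
    }
```

Slicing through a UTF-16 encoding rather than indexing the `str` directly keeps the offsets correct when the caption contains emoji, which occupy two code units each.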

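
The notes on the `media` column suggest keeping only the largest `PhotoSize` and dropping `api_kwargs` and `thumbnail` from a video. A rough sketch of both ideas, assuming the objects have already been converted to plain dicts; `largest_photo` and `trim_video` are hypothetical helper names, not code from this PR.

```python
from __future__ import annotations

from typing import Any


def largest_photo(photo_sizes: list[dict[str, Any]]) -> dict[str, Any] | None:
    """Return the PhotoSize dict with the largest height + width, if any."""
    if not photo_sizes:
        return None
    return max(photo_sizes, key=lambda size: size["height"] + size["width"])


def trim_video(video: dict[str, Any]) -> dict[str, Any]:
    """Keep every video field except api_kwargs and thumbnail."""
    skipped = {"api_kwargs", "thumbnail"}
    return {key: value for key, value in video.items() if key not in skipped}


# Sample values taken from the examples in the table comments.
sizes = [
    {"file_id": "QADNAQ", "file_size": 1446, "height": 90, "width": 58},
    {"file_id": "QADNAQ", "file_size": 19393, "height": 320, "width": 206},
    {"file_id": "QADNAQ", "file_size": 72237, "height": 800, "width": 516},
    {"file_id": "QADNAQ", "file_size": 88190, "height": 1080, "width": 696},
]
video = {
    "api_kwargs": {"thumb": {"width": 175, "height": 320}},
    "duration": 21,
    "file_name": "IMG_2990.MP4",
    "mime_type": "video/mp4",
    "thumbnail": {"height": 320, "width": 175},
}
best = largest_photo(sizes)
trimmed = trim_video(video)
```

Whether a message with several images or videos produces more than one such value is still open, as noted in the comments, so this sketch handles a single photo tuple and a single video only.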
@aleksspevak aleksspevak requested a review from ohld December 30, 2023 11:58
@aleksspevak aleksspevak self-assigned this Dec 30, 2023
@ohld
Member

ohld commented Jan 2, 2024

For now it is not fully clear that this is what we need. Let's keep it as a draft; it will come in handy.


2 participants