Page 10 of 23 FirstFirst ... 8910111220 ... LastLast
Results 91 to 100 of 222

Thread: Notes tests, Scrapping, YouTube

  1. #91
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,303
    Rep Power
    10

    Scrap YouTube, its shit

    This is post https://www.excelfox.com/forum/showt...rapping/page10
    https://excelfox.com/forum/showthrea...ll=1#post19700
    https://http://www.excelfox.com/forum/showth...rapping/page10
    https://excelfox.com/forum/showthread.php/2840-Notes-tests-Scrapping?p=19700&viewfull=1#post19700






    In support of this main Forum thread
    https://eileenslounge.com/viewtopic....303644#p303644
    In particular latest post : https://eileenslounge.com/viewtopic....303704#p303704




    First quick working attempt
    Last edited by DocAElstein; 01-25-2023 at 02:56 PM.

  2. #92
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,303
    Rep Power
    10


    In support of this main Forum thread
    https://eileenslounge.com/viewtopic....303644#p303644



    First quick working attempt

    Some brief notes of what I did, problems etc.
    _ In the long play list I looked at it seems you only get a text file of all the stuff I want for a bit more than 75 videos at a time. This makes sense and ties up with the experience when you view manually in real time: The scroll box only goes up to on average a bit over the first 75.


    Scrapping that, or rather to say, playing around with the text file from the page source text from this
    Code:
     https://www.youtube.com/watch?v=rM-CtC6cklI&list=UULFwInqvNXb-GN0JHdtoul_9A    '  --  main play list link
    ,give links of this form
    https://http://www.youtube.com/watch?v=rM-Ct...oul_9A&index=1
    https://http://www.youtube.com/watch?v=YsnmN...oul_9A&index=2
    https://http://www.youtube.com/watch?v=KIx_8...oul_9A&index=3

    …….. up to about &index=79
    If you want the next chunk of videos, and a new text file of it all, you have to click on a video towards the bottom. ( https://i.postimg.cc/65L3ydNF/Click-...t-next-lot.jpg ) I thought I would keep stuff in some organised order, so tried getting all the text in a text file from these 9 links, the ones ending with &index=1, &index=76, &index=151, &index=226 …. 301, 376,451,526,601
    That sort of worked…. Eventually…
    _ I end up with 9 big text files to play with So that is sort of Part 1. I got now all the info I need, somewhere I expect, in those files… https://i.postimg.cc/R06JWCxf/9-Big-...text-files.jpg


    WieGehtsYouTubeServerChrome1.txt https://app.box.com/s/0r4fsgn31gjtzoep22e31378m719znh7
    WieGehtsYouTubeServerChrome76.txt https://app.box.com/s/c2y7978m1o4qqzeia15vaz2ry6jygndo
    WieGehtsYouTubeServerChrome151.txt https://app.box.com/s/aj1a0gdg45lhwu24nsykihz3ln3opj2z
    WieGehtsYouTubeServerChrome226.txt https://app.box.com/s/or5vbv6abv2zb8mtnsz5z54u895fgn7e
    WieGehtsYouTubeServerChrome301.txt https://app.box.com/s/j0cry0vh93w17g5m2mjtzvg0dcvb1437
    WieGehtsYouTubeServerChrome376.txt https://app.box.com/s/d62s25tmv1mdfvyhxhxvcnvxf8bkde3q
    WieGehtsYouTubeServerChrome451.txt https://app.box.com/s/uxt1secic6beh8ejh22g79pzpj61qox6
    WieGehtsYouTubeServerChrome526.txt https://app.box.com/s/h5vakr7abi0r3edzhjdkrdcecanfxoh2
    WieGehtsYouTubeServerChrome601.txt https://app.box.com/s/eftpuaxfnl8nrsvt6xbo0vn3n4klv2wb



    _ a small snag: Previously using the main link, https://http://www.youtube.com/watch?v=rM-Ct...-GN0JHdtoul_9A , gets the first 79 links and with the index number, which is not essential but useful to have. But use a link with the extra &index=123 and I can’t find or get the index number from those 9 text files. Could be hidden there somewhere. I can’t see it initially. Maybe later.
    No matter, not so important
    _ ( I am actually using initially a hybrid Yasser/ SpeakEasy suggestion code to get those. So
    Object "MSXML2.ServerXMLHTTP"
    and the
    .setRequestHeader "User-Agent", "Chrome".
    Maybe that’s a sort of “belt and braces” approach? I don’t know. I have not had the time to look in great detail at the differences yet in the three files. The hybrid comes out the smallest of the three.
    ( https://i.postimg.cc/MK5Q4rYc/Hybrid...-text-file.jpg ) )



    Coding to get those 9 text files
    Code:
    Sub WieGehtsYouTubeURLServerChromeHybridStep75()   '     https://eileenslounge.com/viewtopic.php?p=303644#p303644   https://excelfox.com/forum/showthread.php/2656-Automated-Search-Results-Returning-Nothing            https://excelfox.com/forum/showthread.php/973-Lookup-First-URL-From-Google-Search-Result-Using-VBA
     On Error GoTo Bed
        '_1 First section get the long text string of the HTML coding of the internet Page
        '_1(i) get the long single text string
            With CreateObject("MSXML2.ServerXMLHTTP")
             '                                                      .Open "GET", "https://www.youtube.com/watch?v=rM-CtC6cklI&list=UULFwInqvNXb-GN0JHdtoul_9A", False ' 'just preparing the request type, how and what type... "The True/False argument of the HTTP Request is the Asynchronous mode flag. If set False then control is immediately returns to VBA after Send is executed. If set True then control is returned to VBA after the server has sent back a response.
             '.Open "GET", "https://www.youtube.com/watch?v=4vcAvCLMyUY&list=UULFwInqvNXb-GN0JHdtoul_9A&index=1", False ' '
             '.Open "GET", "https://www.youtube.com/watch?v=NVaMcQcWLKc&list=UULFwInqvNXb-GN0JHdtoul_9A&index=76", False ' '
             '.Open "GET", "https://www.youtube.com/watch?v=8a0nYGk_DkE&list=UULFwInqvNXb-GN0JHdtoul_9A&index=151", False ' '
             '.Open "GET", "https://www.youtube.com/watch?v=4VreecmIQOY&list=UULFwInqvNXb-GN0JHdtoul_9A&index=226", False ' '
             '.Open "GET", "https://www.youtube.com/watch?v=WDCmlmylNm8&list=UULFwInqvNXb-GN0JHdtoul_9A&index=301", False ' '
             '.Open "GET", "https://www.youtube.com/watch?v=Pr2sS5p0wcE&list=UULFwInqvNXb-GN0JHdtoul_9A&index=376", False ' '
             '.Open "GET", "https://www.youtube.com/watch?v=ppJI61RNY0M&list=UULFwInqvNXb-GN0JHdtoul_9A&index=451", False ' '
             '.Open "GET", "https://www.youtube.com/watch?v=RgMdq3uQNuM&list=UULFwInqvNXb-GN0JHdtoul_9A&index=526", False ' '
             .Open "GET", "https://www.youtube.com/watch?v=YofVQq3VngI&list=UULFwInqvNXb-GN0JHdtoul_9A&index=601", False ' '
             '.Open "GET", "", False ' '
             '.Open "GET", "", False ' '
             '.Open "GET", "", False ' '
             'No extra info here for type GET
             '.setRequestHeader bstrheader:="Ploppy", bstrvalue:="PooH" ' YOU MAY NEED TO TAKE OUT THIS LINE
                                                                                        '.setRequestHeader bstrheader:="If-Modified-Since", bstrvalue:="Sat, 1 Jan 2000 00:00:00 GMT" '  https://www.autohotkey.com/boards/viewtopic.php?t=9554  ---   It will caching the contents of the URL page. Which means if you request the same URL more than once, you always get the same responseText even the website changes text every time. This line is a workaround : Set cache related headers.
             .setRequestHeader "User-Agent", "Chrome"  '  https://eileenslounge.com/viewtopic.php?p=303639#p303639
             .send ' varBody:= ' No extra info for type GET. .send actually makes the request
                While .readyState <> 4: DoEvents: Wend ' Allow other processes to run while the web page loads. Think this is part of the True option
            Dim PageSrc As String: Let PageSrc = .responseText ' Save the HTML code in the (Global) variable. ': Range("P1").Value = PageSrc 'For me for a print out copy to text file etc.    The responseText property returns the information requested by the Open method as a text string
            End With
        '_1(ii)  Optional secion  to put the text string into a text file , for ease of code developments
        Dim FileNum2 As Long: Let FileNum2 = FreeFile(0)                                  ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
        Dim PathAndFileName2 As String
         Let PathAndFileName2 = ThisWorkbook.Path & "\" & "WieGehtsYouTubeServerChrome601" & ".txt" ' "WieGehtsYouTubeServerChrome526" & ".txt" ' "WieGehtsYouTubeServerChrome451" & ".txt" '  "WieGehtsYouTubeServerChrome376" & ".txt" '  "WieGehtsYouTubeServerChrome301" & ".txt" '  "WieGehtsYouTubeServerChrome226" & ".txt" '  "WieGehtsYouTubeServerChrome151" & ".txt" '  "WieGehtsYouTubeServerChrome76" & ".txt"   '   "WieGehtsYouTubeServerChrome1" & ".txt"   '
        Open PathAndFileName2 For Output As #FileNum2 ' ' The text file will be made if not there, and if it is there and already contains data, then the data will be overwritten
         Print #FileNum2, PageSrc '
         Close #FileNum2
        
    Exit Sub  '  Normal code error in the case of no errors
    Bed:
     MsgBox prompt:=Err.Number & ":  " & Err.Description: Debug.Print Err.Number & ":  " & Err.Description
    End Sub   ' Code end in the case of any error
    '    Dim sTitle As String
    '     Let sTitle = Split(Split(PageSrc, """title"":{""runs"":[{""text"":""")(1), """}]}")(0)
    '
    '    Dim sViews As String
    '     Let sViews = Split(Split(PageSrc, """shortViewCount"":{""simpleText"":""")(1), """}}}")(0)
    
    Last edited by DocAElstein; 01-25-2023 at 02:02 AM.
    ….If you are my competitor, I will try all I can to beat you. But if I do, I will not belittle you. I will Salute you, because without you, I am nothing.
    If you are my enemy, we will try to kick the fucking shit out of you…..
    Winston Churchill, 1939
    Save your Forum..._
    _...KILL A MODERATOR!!

  3. #93
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,303
    Rep Power
    10
    Part 2
    Get the 4vcAvCLMyUY type bit for use in like https://www.youtube.com/watch?v=4vcAvCLMyUY

    I decided to get out all 11 digit unique YouTube bits ( like WDCmlmylNm8 ) you have in a typical YouTube video link, like https://www.youtube.com/watch?v=WDCmlmylNm8
    I find by inspection that there seems to be all these 11 digit unique YouTube bits, (sometimes duplicated**) in some text ending with like hqdefault.jpg. So that is what is looked for, then a bit of text manipulation is done to pick out the 11 digit unique YouTube bit

    I used a macro like this next one, to get all the 11 digit bits from the 9 files got in the last post.
    Code:
    Sub GetPlayListLinksFromTextFileUsinghqdefaultjpg()  '   look fo this - hqdefault.jpg
    Rem 0
    Dim Ws1 As Worksheet: ' Set Ws1 = ThisWorkbook.Worksheets.Item(1)
    Set Ws1 = ThisWorkbook.Worksheets.Item("RoughNotes")
    ' Rem 1 Get the text file as a long single string
    Dim FileNum As Long: Let FileNum = FreeFile(1)                                    ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
    Dim PathAndFileName As String, TotalFile As String
     '                                             Let PathAndFileName = ThisWorkbook.Path & Application.PathSeparator & "WieGehtsYouTubeServerChrome.txt"   '
     Let PathAndFileName = ThisWorkbook.Path & Application.PathSeparator & "WieGehtsYouTubeServerChrome601.txt" ' "WieGehtsYouTubeServerChrome526.txt" ' "WieGehtsYouTubeServerChrome451.txt" ' "WieGehtsYouTubeServerChrome376.txt" ' "WieGehtsYouTubeServerChrome301.txt" ' "WieGehtsYouTubeServerChrome226.txt" ' "WieGehtsYouTubeServerChrome151.txt" '  "WieGehtsYouTubeServerChrome76.txt" '  '"WieGehtsYouTubeServerChrome1.txt"   '
    Open PathAndFileName For Binary As #FileNum 'Open Route to data. Binary is a fundamental type data input...
    ' Let TotalFile = Space(LOF(FileNum)) '....and wot receives it has to be a string of exactly the right length
    'Get #FileNum, , TotalFile
    '  Or  http://www.eileenslounge.com/viewtopic.php?p=295782&sid=f6dcab07c4d24e00e697fe4343dc7392#p295782
     Let TotalFile = Input(LOF(FileNum), FileNum)
    Close #FileNum
    ' Rem 2 Get all links based on the unique bit of index=x, where x is 1 2 3 4 ...... etc
    Dim Cnt As Long: Let Cnt = 1
    Dim TextBit As String: Let TextBit = TotalFile
    Dim posJpg As Long: Let posJpg = InStr(1, TextBit, "hqdefault.jpg", vbBinaryCompare)
        Do While posJpg <> 0
        Dim strURL As String: Let strURL = Mid(TextBit, posJpg - 12, 11)
        Dim Unics As String
            If InStr(1, Unics, strURL, vbBinaryCompare) = 0 Then
             Let Unics = Unics & " " & strURL
             Dim Lr1 As Long: Let Lr1 = Ws1.Range("B" & Ws1.Rows.Count & "").End(xlUp).Row
             Dim Nr As Long
                If Ws1.Range("B1").Value = "" Then
                 Let Nr = 1
                Else
                 Let Nr = Lr1 + 1
                End If
             Let Ws1.Range("B" & Nr & "").Value = strURL
                
            Else ' Got a dup
             Let Ws1.Range("C" & Nr & "").Value = Ws1.Range("C" & Nr & "").Value + 1 ' for count of dups
            End If
         Let TextBit = Mid(TextBit, posJpg + 1)
         Let posJpg = InStr(1, TextBit, "hqdefault.jpg", vbBinaryCompare)
        Loop
    End Sub
    That code above checks for duplicates of that 11 digit bit in each text file**, but there may be duplicates for the list from each text file as I click a bit above the last link, say on 76th link ,rather than the typical last 79th link, ( if I clicked on the 79th link I would get the unique 11 digit bit for the 79th video twice from the first two text files, by clicking on the 76th link I will get in both the first two text files the 11 digit bit for the 76th 77th 78th and 79th )

    That outputs to a spare sheet.
    For now I manually copy that output 9 times and stick it all in column A of my main file, WieGehtsYouTube.xls ( https://app.box.com/s/97fnm2hhhbiwcnz4nte700pp9sqy79uy ).

    (There are a few extra videos that seems to be advertisements or some video he recommends from someone else. Doesn’t matter – its obvious usually from the title wots wot. I also have the duplicates mentioned, but I take them out at the start of the next macro
    I do it all like this for no special reason – its just the way it came out the first time as I went along. )
    Last edited by DocAElstein; 02-25-2023 at 02:42 PM.

  4. #94
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,303
    Rep Power
    10
    Part 3 How I got all the info I wanted
    Main final macro

    This went easier and smother than I thought it would.
    And The final spreadsheet interaction coding isn’t that slow, - it’s still speed of light compared with doing it all manually, as I was. It’s actually nice to watch the spreadsheet filling up. Its fun when you think of the days of boring manual copying and pasting its saving and you get an initial check that the data looks sensible.
    I don’t want to do this a thousand times a day for a year, - more like a few times a day for a couple of weeks. So I might stay with the slower novice code, - it’s easier to check and change. (It’s a bit cold though. I might put some clothes on. I don’t need to view this in my default skin).
    There is not much point in explaining in detail how I manipulated the text file to get all the information I wanted. I expect if I did it a dozen times , forgetting every time how I did it the last time, then I would end up with as many different solutions. Just a matter of messing with string manipulation.

    Code in the next post

    One thing I did find nice is that Split Split stuff from Yasser. Maybe lots of people know about and use that. I saw it for the first time and it’s a very nice way to get a working coding to get stuff out of a big text file. Like…

    jdhAJ Ex I want this Zed llmbldsm
    So split by Ex , take second array element (1) from that
    , then split the result by Zed and take the first element (0) of that
    Simple but nice- I had always previously done some Instr Left Right Mid stuff before
    So you can have a nice Pretty one line to start with,
    Split(Split(PageSource, " ")(1), " ")(0)
    Find what you are looking for, then drop in a bit of the stuff either side
    = Split(Split(PageSource, " Ex ")(1), " Zed ")(0)

    Sounds like a good one for a “YouTube short”
    Last edited by DocAElstein; 01-25-2023 at 01:13 AM.
    ….If you are my competitor, I will try all I can to beat you. But if I do, I will not belittle you. I will Salute you, because without you, I am nothing.
    If you are my enemy, we will try to kick the fucking shit out of you…..
    Winston Churchill, 1939
    Save your Forum..._
    _...KILL A MODERATOR!!

  5. #95
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,303
    Rep Power
    10

    First main working coding attempt

    First main working coding attempt, explanation in 'comments and last post


    Code:
    
    Sub GetStuffFrom11DigitYouTube()
    Rem 0 Worksheets info
    Dim WsYT11 As Worksheet: Set WsYT11 = ThisWorkbook.Worksheets("ElevenDigitYT")
    Dim RngWsYT11 As Range: Set RngWsYT11 = WsYT11.Range("A1:A1008")
    Dim Unics As String
    Dim Cnt As Long
        For Cnt = 2 To 1009
            If InStr(1, Unics, RngWsYT11.Item(Cnt).Value2, vbBinaryCompare) = 0 Then ' Check to see if I not got this the 11 digit bit yet .....   but there may be duplicates for the list from each text file as I click a bit above the last link, say on 76th link ,rather than the typical last 79th link, ( if I clicked on the 79th link I would get the unique 11 digit bit for the 79th video twice from the first two text files, by clicking on the 76th link I will get in both the first two text files the 11 digit bit for the 76th 77th 78th and 79th )
             Let RngWsYT11.Item(Cnt).Offset(0, 1).Value2 = RngWsYT11.Item(Cnt).Value2 ' Puts the 11 digit bit in column B if i have not had that 11 digit bit yet
             Let Unics = Unics & RngWsYT11.Item(Cnt).Value2 & " " ' Put the 11 digit bit in a string which i check to see if i got this one already
            Else
             ' already got this 11 digit bit, so leave the row empty
            End If
        Next Cnt
    Rem 1 The main stuff now - take every 11 digit bit , make a full link of it, scrape all the text and do all the business
        For Cnt = 2 To 1009
            If RngWsYT11.Item(Cnt).Offset(0, 1).Value2 <> "" Then
                With CreateObject("MSXML2.ServerXMLHTTP")
                 .Open "GET", "https://www.youtube.com/watch?v=" & RngWsYT11.Item(Cnt).Offset(0, 1).Value2, False ' '
                 .setRequestHeader "User-Agent", "Chrome"  '  https://eileenslounge.com/viewtopic.php?p=303639#p303639
                 .send ' varBody:= ' No extra info for type GET. .send actually makes the request
                    While .readyState <> 4: DoEvents: Wend ' Allow other processes to run while the web page loads. Think this is part of the True option
                Dim PageSrc As String: Let PageSrc = .responseText ' Save the HTML code in the (Global) variable. ': Range("P1").Value = PageSrc 'For me for a print out copy to text file etc.    The responseText property returns the information requested by the Open method as a text string
                End With
                '_1(ii)  Optional secion  to put the text string into a text file , for ease of code developments and debugging
            Dim FileNum2 As Long: Let FileNum2 = FreeFile(0)                                  ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
            Dim PathAndFileName2 As String
             Let PathAndFileName2 = ThisWorkbook.Path & "\" & "WieGehtsYouTubeServerChrome" & RngWsYT11.Item(Cnt).Offset(0, 1).Value2 & ".txt" ' "WieGehtsYouTubeServerChrome526" & ".txt" ' "WieGehtsYouTubeServerChrome451" & ".txt" '  "WieGehtsYouTubeServerChrome376" & ".txt" '  "WieGehtsYouTubeServerChrome301" & ".txt" '  "WieGehtsYouTubeServerChrome226" & ".txt" '  "WieGehtsYouTubeServerChrome151" & ".txt" '  "WieGehtsYouTubeServerChrome76" & ".txt"   '   "WieGehtsYouTubeServerChrome1" & ".txt"   '
            Open PathAndFileName2 For Output As #FileNum2 ' ' The text file will be made if not there, and if it is there and already contains data, then the data will be overwritten
             Print #FileNum2, PageSrc '
             Close #FileNum2
     
    ' If RngWsYT11.Item(Cnt).Offset(0, 1).Value2 = "GWGzz5p5D80" Then Stop In case I want to stop at a particular 11 digit bit for debugging
            Dim TextBit As String ' The idea is to get a chunk of text that seems to have all the info I want
             Let TextBit = Split(PageSrc, "videoDescriptionHeaderRenderer""", 2, vbBinaryCompare)(1)
             'Let TextBit = Mid(TextBit, 1, 600)
             Let TextBit = Mid(TextBit, 1, 1400)
            
            Dim Title As String
    '         Let Title = Split(TextBit, ":{""title"":{""runs"":[{""text"":""", 2, vbBinaryCompare)(1)
    '         'Let Title = Split(Title, """}]},""channel""", 2, vbBinaryCompare)(0)
    '         Let Title = Split(Title, """}", 2, vbBinaryCompare)(0)
    '            If InStr(1, TextBit, """,""navigationEndpoint""", vbBinaryCompare) <> 0 Then
    '            Let Title = Split(TextBit, ":{""title"":{""runs"":[{""text"":""", 2, vbBinaryCompare)(1)
    '            Dim Pos1nav As Long, Pos2nav As Long
    '             Let Pos1nav = InStr(1, Title, """,""navigationEndpoint""", vbBinaryCompare)
    '             Let Pos2nav = InStr(Pos1nav, Title, "{""text"":""", vbBinaryCompare)
    '             Let Title = Left(Title, Pos1nav - 1) & Right(Title, Len(Title) - (Pos2nav + 8))
    '             Let Title = Split(Title, """}", 2, vbBinaryCompare)(0)
    '            Else
    '            End If
    '  That above checked for link text crap, but only worked if the link was at the beginning of the title: If the link was inside it or at the end, we will end up with multiple text bits, (and the link text crap)  I need to join together and ignore link text crap  -  BTW "link text crap" is what comes after the text seen for the link which is needed to make it work, but i don't want it        .... Text1 .. Textseenforlink .. textcrapneededtomakelinkwork  ... Text3    etc
            Dim posTxtTag As Long ' Usually there will be just one bit of   {"text":"    after which come the title, but sometimes there may be a few as in the case of a link in the title..
             Let posTxtTag = InStr(1, TextBit, "{""text"":""", vbBinaryCompare)
                Do While posTxtTag <> 0 And posTxtTag < InStr(1, TextBit, """channel""", vbBinaryCompare) ' hopefully the text    channel  always comes after the title and before anything else I want
                Dim posTxtTagEnd As Long, posTxtTagEnd2 As Long
                 Let posTxtTagEnd = InStr(posTxtTag, TextBit, """,""navigationEndpoint", vbBinaryCompare)  ' This gives me the end of the text for the case of a link text
                    If posTxtTagEnd = 0 Then Let posTxtTagEnd = 999 ' If I ain't got a link then I make this big so it does not get used
                 Let posTxtTagEnd2 = InStr(posTxtTag, TextBit, """}", vbBinaryCompare)  ' mostly this would be the end of a text bit
                 Let posTxtTagEnd = Application.WorksheetFunction.Min(posTxtTagEnd, posTxtTagEnd2) ' Whatever text end I have or watever text end comes first will be used
                 Let Title = Title & Mid(TextBit, posTxtTag + 9, (posTxtTagEnd - (posTxtTag + 9))) ' mostly this would only be done once, just more times for building the text if its split into bits by the use of a link in the title
                 Let posTxtTag = InStr(posTxtTagEnd, TextBit, "{""text"":""", vbBinaryCompare) ' This will mostly be 0, unless text is split into bits by the use of a link in the title
                Loop ' While posTxtTag <> 0
             Let RngWsYT11.Item(Cnt).Offset(0, 2).Value2 = Title
             Let Title = "" ' If i don't do this the  Title = Title &   coding will keep adding the titles together
             RngWsYT11.Item(Cnt).Offset(0, 2).Select ' This is helpful or else I cant see the worksheet being filled after the first few
            ' To get the next info i use a lot the  Split Split  bit Yasser showed me   https://eileenslounge.com/viewtopic.php?p=303638#p303638
            Dim Views As String
             Let Views = Split(TextBit, """},""views"":{""simpleText"":""", 2, vbBinaryCompare)(1)
             Let Views = Split(Views, " Aufrufe""}", 2, vbBinaryCompare)(0)
             Let Views = Replace(Views, ".", "", 1, -1, vbBinaryCompare) ' I don't like seperators of any kind
             Let RngWsYT11.Item(Cnt).Offset(0, 4).Value2 = Views
            Dim PubDate As String
             Let PubDate = Split(TextBit, "publishDate"":{""simpleText"":""", 2, vbBinaryCompare)(1)
             Let PubDate = Split(PubDate, """},""factoid"":[{", 2, vbBinaryCompare)(0)
             Let RngWsYT11.Item(Cnt).Offset(0, 3).Value2 = PubDate
            ' date nightmares again
            Dim PubDateV2 As Long, Naw As Long ' DateSerial(year, month, day)
            Dim PubDateRaw As String: Let PubDateRaw = Replace(PubDate, "Premiere am ", "", 1, 1, vbBinaryCompare)
             Let PubDateV2 = DateSerial(Right(PubDateRaw, 4), Mid(PubDateRaw, 4, 2), Left(PubDateRaw, 2))
             Let Naw = Evaluate("=NOW()"): ' Debug.Print Naw
    '         Debug.Print Format(PubDateV2, "\Da\y i\s " & "dddd")
    '         Debug.Print Format(PubDateV2, "dddd DD mmm YYYY") & " " & Format(Naw, "dddd DD mmm YYYY")
             Let RngWsYT11.Item(Cnt).Offset(0, 3).Value2 = PubDateV2
             Let RngWsYT11.Item(Cnt).Offset(0, 3).Value = PubDate
             Let RngWsYT11.Item(Cnt).Offset(0, 3).Value = Format(PubDateV2, "dddd DD mmm YYYY")
             Let RngWsYT11.Item(Cnt).Offset(0, 5).Value = Int(Views / (Naw - PubDateV2))
            Dim Likeses As String
             Let Likeses = Split(TextBit, "Mag ich\""-Bewertungen""},""accessibilityText"":""", 2, vbBinaryCompare)(1)
             Let Likeses = Split(Likeses, " \""Mag ich\""-Bewertungen", 2, vbBinaryCompare)(0)
                If InStr(1, Likeses, "Keine", vbBinaryCompare) = 0 Then ' Sometimes the likes are not there  or not showmn or something - only happend in a video not from the main author
                 'Let RngWsYT11.Item(Cnt).Offset(0, 6).Value = Likeses
                 Let RngWsYT11.Item(Cnt).Offset(0, 7).Value = Int(Likeses / (Naw - PubDateV2))
                Else
                 Let Likeses = "Keine"
                End If
             Let RngWsYT11.Item(Cnt).Offset(0, 6).Value = Likeses
            RngWsYT11.Parent.Columns("A:H").AutoFit
            Else
            End If
        Next Cnt
    End Sub
    Last edited by DocAElstein; 01-25-2023 at 01:53 AM.

  6. #96
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,303
    Rep Power
    10
    https://excelfox.com/forum/showthrea...ge10#post19703
    https://excelfox.com/forum/showthrea...ll=1#post19703 ???







    Here is another go to update and take advantage of things learnt
    ….&index=1, &index=76, &index=151, &index=226 …. 301, 376,451,526,601
    https://www.youtube.com/watch?v=Cy4_...oul_9A&index=1
    https://www.youtube.com/watch?v=d_RO...ul_9A&index=76
    https://www.youtube.com/watch?v=gKNh...l_9A&index=151
    https://www.youtube.com/watch?v=JzU7...l_9A&index=226
    https://www.youtube.com/watch?v=g_1_...l_9A&index=301
    https://www.youtube.com/watch?v=IAIE...l_9A&index=376
    https://www.youtube.com/watch?v=9u37...l_9A&index=451
    https://www.youtube.com/watch?v=6SDk...l_9A&index=526
    https://www.youtube.com/watch?v=EWr8...l_9A&index=601


    Code:
    Option Explicit
    Sub WieGehtsYouTubeURLServerChromeHybridStep75_2()   '     https://eileenslounge.com/viewtopic.php?p=303644#p303644   https://excelfox.com/forum/showthread.php/2840-Notes-tests-Scrapping-YouTube/page11#post19741  https://excelfox.com/forum/showthread.php/2656-Automated-Search-Results-Returning-Nothing            https://excelfox.com/forum/showthread.php/973-Lookup-First-URL-From-Google-Search-Result-Using-VBA
     On Error GoTo Bed
    '_1 First section get the long text string of the HTML coding of the internet Page
    '_1(i) get the long single text string
    Dim strURLs As String: Let strURLs = "https://www.youtube.com/watch?v=Cy4_zrFja2w&list=UULFwInqvNXb-GN0JHdtoul_9A&index=1" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=d_RO2VIcFYw&list=UULFwInqvNXb-GN0JHdtoul_9A&index=76" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=gKNh43aEw_E&list=UULFwInqvNXb-GN0JHdtoul_9A&index=151" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=JzU7jFWbA6s&list=UULFwInqvNXb-GN0JHdtoul_9A&index=226" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=g_1_saf0E1I&list=UULFwInqvNXb-GN0JHdtoul_9A&index=301" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=IAIESH9vPbk&list=UULFwInqvNXb-GN0JHdtoul_9A&index=376" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=9u372_W07Nw&list=UULFwInqvNXb-GN0JHdtoul_9A&index=451" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=6SDkQ-iMrC4&list=UULFwInqvNXb-GN0JHdtoul_9A&index=526" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=EWr8G0r89k0&list=UULFwInqvNXb-GN0JHdtoul_9A&index=601"
    Dim URLs() As String: Let URLs() = Split(strURLs, vbCr & vbLf, 9, vbBinaryCompare)
    Dim Cnt As Long
        For Cnt = LBound(URLs()) To UBound(URLs())
        Dim strURL As String, Indx As String
         Let strURL = URLs(Cnt)
         Let Indx = Right(strURL, Len(strURL) - InStrRev(strURL, "&", -1, vbBinaryCompare))
         Let Indx = Replace(Indx, "=", "_", 1, 1, vbBinaryCompare)
                With CreateObject("MSXML2.ServerXMLHTTP")
                 .Open "GET", strURL, False ' '
                 '.Open "GET", "", False ' '
                 'No extra info here for type GET
                 '.setRequestHeader bstrheader:="Ploppy", bstrvalue:="PooH" ' YOU MAY NEED TO TAKE OUT THIS LINE
                                                                                            '.setRequestHeader bstrheader:="If-Modified-Since", bstrvalue:="Sat, 1 Jan 2000 00:00:00 GMT" '  https://www.autohotkey.com/boards/viewtopic.php?t=9554  ---   It will caching the contents of the URL page. Which means if you request the same URL more than once, you always get the same responseText even the website changes text every time. This line is a workaround : Set cache related headers.
                 .setRequestHeader "User-Agent", "Chrome"  '  https://eileenslounge.com/viewtopic.php?p=303639#p303639
                 .send ' varBody:= ' No extra info for type GET. .send actually makes the request
                    While .readyState <> 4: DoEvents: Wend ' Allow other processes to run while the web page loads. Think this is part of the True option
                Dim PageSrc As String: Let PageSrc = .responseText ' Save the HTML code in the (Global) variable. ': Range("P1").Value = PageSrc 'For me for a print out copy to text file etc.    The responseText property returns the information requested by the Open method as a text string
                End With
            '_1(ii)  Optional secion  to put the text string into a text file , for ease of code developments
            Dim FileNum2 As Long: Let FileNum2 = FreeFile(0)                                  ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
            Dim PathAndFileName2 As String
             Let PathAndFileName2 = ThisWorkbook.Path & "\" & "Videos(1)\" & "WieGehtsYouTubePopularServerChrome" & Indx & ".txt" ' "WieGehtsYouTubeServerChrome526" & ".txt" ' "WieGehtsYouTubeServerChrome451" & ".txt" '  "WieGehtsYouTubeServerChrome376" & ".txt" '  "WieGehtsYouTubeServerChrome301" & ".txt" '  "WieGehtsYouTubeServerChrome226" & ".txt" '  "WieGehtsYouTubeServerChrome151" & ".txt" '  "WieGehtsYouTubeServerChrome76" & ".txt"   '   "WieGehtsYouTubeServerChrome1" & ".txt"   '
            Open PathAndFileName2 For Output As #FileNum2 ' ' The text file will be made if not there, and if it is there and already contains data, then the data will be overwritten
             Print #FileNum2, PageSrc '
             Close #FileNum2
        Next Cnt
    Exit Sub  '  Normal code error in the case of no errors
    Bed:
     MsgBox Prompt:=Err.Number & ":  " & Err.Description: Debug.Print Err.Number & ":  " & Err.Description
    End Sub   ' Code end in the case of any error

    That last coding got the text files



    The next macro needed little change other than the Folder in which the text files are, and the new worksheet used for this new attempt at getting all info from video playlist

    Code:
    '     https://excelfox.com/forum/showthread.php/2840-Notes-tests-Scrapping-YouTube/page10#post19703
    Sub GetPlayListLinksFromTextFileUsinghqdefaultjpg_VideoPlayList()  '   look for this - hqdefault.jpg
    Rem 0a
    Dim WsPop As Worksheet:  Set WsPop = ThisWorkbook.Worksheets.Item("ElevenDigitYT(1)")
    'Set Ws1 = ThisWorkbook.Worksheets.Item("RoughNotes")
    Rem 0b An Array of all the 9 text files got from the last macro   Sub WieGehtsYouTubeURLServerChromeHybridStep75_2()
    Dim strTxts As String: Let strTxts = "WieGehtsYouTubePopularServerChromeindex_1.txt" & vbCr & vbLf & _
    "WieGehtsYouTubePopularServerChromeindex_76.txt" & vbCr & vbLf & _
    "WieGehtsYouTubePopularServerChromeindex_151.txt" & vbCr & vbLf & _
    "WieGehtsYouTubePopularServerChromeindex_226.txt" & vbCr & vbLf & _
    "WieGehtsYouTubePopularServerChromeindex_301.txt" & vbCr & vbLf & _
    "WieGehtsYouTubePopularServerChromeindex_376.txt" & vbCr & vbLf & _
    "WieGehtsYouTubePopularServerChromeindex_451.txt" & vbCr & vbLf & _
    "WieGehtsYouTubePopularServerChromeindex_526.txt" & vbCr & vbLf & _
    "WieGehtsYouTubePopularServerChromeindex_601.txt"
    Dim Txts() As String: Let Txts() = Split(strTxts, vbCr & vbLf, 9, vbBinaryCompare)
        
    Dim Cnt As Long
        For Cnt = LBound(Txts()) To UBound(Txts())    ' Loop all the text files got from the last macro   Sub WieGehtsYouTubeURLServerChromeHybridStep75_2() ==
        ' Rem 1 Get the text files as a long single string
        Dim FileNum As Long: Let FileNum = FreeFile(1)                                    ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
        Dim PathAndFileName As String, TotalFile As String
         '                                             Let PathAndFileName = ThisWorkbook.Path & Application.PathSeparator & "WieGehtsYouTubeServerChrome.txt"   '
         Let PathAndFileName = ThisWorkbook.Path & Application.PathSeparator & "Videos(1)\" & Txts(Cnt)   ' "WieGehtsYouTubeServerChrome601.txt" ' "WieGehtsYouTubeServerChrome526.txt" ' "WieGehtsYouTubeServerChrome451.txt" ' "WieGehtsYouTubeServerChrome376.txt" ' "WieGehtsYouTubeServerChrome301.txt" ' "WieGehtsYouTubeServerChrome226.txt" ' "WieGehtsYouTubeServerChrome151.txt" '  "WieGehtsYouTubeServerChrome76.txt" '  '"WieGehtsYouTubeServerChrome1.txt"   '
        Open PathAndFileName For Binary As #FileNum 'Open Route to data. Binary is a fundamental type data input...
        ' Let TotalFile = Space(LOF(FileNum)) '....and wot receives it has to be a string of exactly the right length
        'Get #FileNum, , TotalFile
        '  Or  http://www.eileenslounge.com/viewtopic.php?p=295782&sid=f6dcab07c4d24e00e697fe4343dc7392#p295782
         Let TotalFile = Input(LOF(FileNum), FileNum)
        Close #FileNum
        ' Rem 2 Get all links based on the unique bit of index=x, where x is 1 2 3 4 ...... etc
        '                   Dim Cnt As Long: Let Cnt = 1
        Dim TextBit As String: Let TextBit = TotalFile
        Dim posJpg As Long: Let posJpg = InStr(1, TextBit, "hqdefault.jpg", vbBinaryCompare)
            Do While posJpg <> 0
            Dim strURL As String: Let strURL = Mid(TextBit, posJpg - 12, 11)
            Dim Unics As String  '  This is mainly because sometimes the same 11 digit bit  appears a few times in a text file,    But Note that because I dont initialise / reset this then, unlike the previous code done once for each text file,  I will also catch the duplicates caused by me overlapping the URLs that I used, like , example at the start of the second main outer loop, the 76th 77th 78th and 79th would not get added
                If InStr(1, Unics, strURL, vbBinaryCompare) = 0 Then
                 Let Unics = Unics & " " & strURL
                 Dim Lr1 As Long: Let Lr1 = WsPop.Range("A" & WsPop.Rows.Count & "").End(xlUp).Row
                 Dim Nr As Long
    '                If WsPop.Range("B1").Value = "" Then
    '                 Let Nr = 1
    '                Else
                     Let Nr = Lr1 + 1
    '                End If
                 Let WsPop.Range("A" & Nr & "").Value = strURL
                    
                Else ' Got a dup
    '             Let WsPop.Range("C" & Nr & "").Value = WsPop.Range("C" & Nr & "").Value + 1 ' for count of dups
                End If
             Let TextBit = Mid(TextBit, posJpg + 1)
             Let posJpg = InStr(1, TextBit, "hqdefault.jpg", vbBinaryCompare)
            Loop
        Next Cnt   '   ============================
    End Sub
    '_-_____________________________________________________________________________________________________
    
    Last edited by DocAElstein; 02-27-2023 at 07:14 PM.

  7. #97
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,303
    Rep Power
    10

    Main coding to get stuff

    Main Macro
    The most important change I did here was to include extra Title changes mainly those of typical characters that I want to get rid of. This sort of thing
    Code:
        ' Do some empirical text tidying up that I might typically have done in a final video title
          Let Title = Replace(Title, "ä", "ae", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "ü", "ue", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "ö", "oe", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "-", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "#wiegehtyoutube", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "!", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "?", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "ß", "ss", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "€", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, ":", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "#", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "&", " ", 1, -1, vbBinaryCompare) '
          Let Title = Replace(Title, "'", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "‚", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, """", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "“", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "„", " ", 1, -1, vbBinaryCompare)  '    „ajdffak“
          Let Title = Replace(Title, "+", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, ".", "", 1, -1, vbBinaryCompare)
    '     Let Title = Replace(Title, "-", " ", 1, -1, vbBinaryCompare)
    '     Let Title = Replace(Title, "-", " ", 1, -1, vbBinaryCompare)
          Let Title = Application.WorksheetFunction.Trim(Title) ' In case any spaces caused by removing stuff
    
    Here the full current coding:
    Code:
    '   https://excelfox.com/forum/showthread.php/2840-Notes-tests-Scrapping-YouTube/page10#post19704
    Sub GetStuffFrom11DigitYouTube2()  '  https://excelfox.com/forum/showthread.php/2840-Notes-tests-Scrapping-YouTube/page11#post19733
    Rem 0 Worksheets info
    Dim WsYT11 As Worksheet: Set WsYT11 = ThisWorkbook.Worksheets("ElevenDigitYT(1)")
    Dim RngWsYT11 As Range: Set RngWsYT11 = WsYT11.Range("A1:A680") ' 692")
    Dim Cnt As Long
    WsYT11.Activate
    ActiveWindow.Panes(3).Activate ' To get out of top pane and into bottom pane (I have worksheet divided at line 1)  This is so only the bottom pane is scrolled and the first line in pane 1 with the headings in stays there
    Rem 1 The main stuff now - take every 11 digit bit , make a full link of it, scrape all the text and do all the business
        For Cnt = 500 To 680 '  692
        RngWsYT11.Item(Cnt).Select
        'If RngWsYT11.Item(Cnt).Offset(0, 1).Value2 <> "" Then
            With CreateObject("MSXML2.ServerXMLHTTP")  '  .Server - Yasser http://www.eileenslounge.com/viewtopic.php?p=303638#p303638
             .Open "GET", "https://www.youtube.com/watch?v=" & RngWsYT11.Item(Cnt).Value2, False ' '
             .setRequestHeader "User-Agent", "Chrome"  ' - SpeakEasy Mike  https://eileenslounge.com/viewtopic.php?p=303639#p303639
             .send ' varBody:= ' No extra info for type GET. .send actually makes the request
                While .readyState <> 4: DoEvents: Wend ' Allow other processes to run while the web page loads. Think this is part of the True option
            Dim PageSrc As String: Let PageSrc = .responseText ' Save the HTML code in the (Global) variable. ': Range("P1").Value = PageSrc 'For me for a print out copy to text file etc.    The responseText property returns the information requested by the Open method as a text string
            End With
            '_1(ii)  Optional secion  to put the text string into a text file , for ease of code developments and debugging
        Dim FileNum2 As Long: Let FileNum2 = FreeFile(0)                                  ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
        Dim PathAndFileName2 As String
         Let PathAndFileName2 = ThisWorkbook.Path & "\Videos(1)\WieGehtsYouTubeServerChrome" & RngWsYT11.Item(Cnt).Value2 & ".txt" '
        Open PathAndFileName2 For Output As #FileNum2 ' ' The text file will be made if not there, and if it is there and already contains data, then the data will be overwritten
         Print #FileNum2, PageSrc '
         Close #FileNum2
    
    ' If RngWsYT11.Item(Cnt).Offset(0, 1).Value2 = "GWGzz5p5D80" Then Stop In case I want to stop at a particular 11 digit bit for debugging
        Dim TextBit As String ' The idea is to get a chunk of text that seems to have all the info I want
         Let TextBit = Split(PageSrc, "videoDescriptionHeaderRenderer""", 2, vbBinaryCompare)(1)
         'Let TextBit = Mid(TextBit, 1, 600)
         Let TextBit = Mid(TextBit, 1, 1400)
        
        Dim Title As String
    '         Let Title = Split(TextBit, ":{""title"":{""runs"":[{""text"":""", 2, vbBinaryCompare)(1)
    '         'Let Title = Split(Title, """}]},""channel""", 2, vbBinaryCompare)(0)
    '         Let Title = Split(Title, """}", 2, vbBinaryCompare)(0)
    '            If InStr(1, TextBit, """,""navigationEndpoint""", vbBinaryCompare) <> 0 Then
    '            Let Title = Split(TextBit, ":{""title"":{""runs"":[{""text"":""", 2, vbBinaryCompare)(1)
    '            Dim Pos1nav As Long, Pos2nav As Long
    '             Let Pos1nav = InStr(1, Title, """,""navigationEndpoint""", vbBinaryCompare)
    '             Let Pos2nav = InStr(Pos1nav, Title, "{""text"":""", vbBinaryCompare)
    '             Let Title = Left(Title, Pos1nav - 1) & Right(Title, Len(Title) - (Pos2nav + 8))
    '             Let Title = Split(Title, """}", 2, vbBinaryCompare)(0)
    '            Else
    '            End If
    '  That above checked for link text crap, but only worked if the link was at the beginning of the title: If the link was inside it or at the end, we will end up with multiple text bits, (and the link text crap)  I need to join together and ignore link text crap  -  BTW "link text crap" is what comes after the text seen for the link which is needed to make it work, but i don't want it        .... Text1 .. Textseenforlink .. textcrapneededtomakelinkwork  ... Text3    etc
        Dim posTxtTag As Long ' Usually there will be just one bit of   {"text":"    after which come the title, but sometimes there may be a few as in the case of a link in the title..
         Let posTxtTag = InStr(1, TextBit, "{""text"":""", vbBinaryCompare)
            Do While posTxtTag <> 0 And posTxtTag < InStr(1, TextBit, """channel""", vbBinaryCompare) ' hopefully the text    channel  always comes after the title and before anything else I want
            Dim posTxtTagEnd As Long, posTxtTagEnd2 As Long
             Let posTxtTagEnd = InStr(posTxtTag, TextBit, """,""navigationEndpoint", vbBinaryCompare)  ' This gives me the end of the text for the case of a link text
                If posTxtTagEnd = 0 Then Let posTxtTagEnd = 999 ' If I ain't got a link then I make this big so it does not get used
             Let posTxtTagEnd2 = InStr(posTxtTag, TextBit, """}", vbBinaryCompare)  ' mostly this would be the end of a text bit
             Let posTxtTagEnd = Application.WorksheetFunction.Min(posTxtTagEnd, posTxtTagEnd2) ' Whatever text end I have or watever text end comes first will be used
             Let Title = Title & Mid(TextBit, posTxtTag + 9, (posTxtTagEnd - (posTxtTag + 9))) ' mostly this would only be done once, just more times for building the text if its split into bits by the use of a link in the title
             Let posTxtTag = InStr(posTxtTagEnd, TextBit, "{""text"":""", vbBinaryCompare) ' This will mostly be 0, unless text is split into bits by the use of a link in the title
            Loop ' While posTxtTag <> 0
         ' some initially empirically found Title tidying up
          Let Title = Replace(Title, "\u0026", "&", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "\""", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "/", " ", 1, -1, vbBinaryCompare)
        ' Do some empirical text tidying up that I might typically have done in a final video title
          Let Title = Replace(Title, "ä", "ae", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "ü", "ue", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "ö", "oe", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "Ä", "AE", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "Ü", "UE", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "Ö", "OE", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "-", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "#wiegehtyoutube", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "!", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "?", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "ß", "ss", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "€", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, ":", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "#", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "&", " ", 1, -1, vbBinaryCompare) '
          Let Title = Replace(Title, "'", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "‚", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, """", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "“", " ", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "„", " ", 1, -1, vbBinaryCompare)  '    „ajdffak“
          Let Title = Replace(Title, "+", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, ".", "", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "YouTube", "UT", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "[", "(", 1, -1, vbBinaryCompare)
          Let Title = Replace(Title, "]", ")", 1, -1, vbBinaryCompare)
    '     Let Title = Replace(Title, "-", " ", 1, -1, vbBinaryCompare)
    '     Let Title = Replace(Title, "-", " ", 1, -1, vbBinaryCompare)
    '     Let Title = Replace(Title, "-", " ", 1, -1, vbBinaryCompare)
    '     Let Title = Replace(Title, "-", " ", 1, -1, vbBinaryCompare)
    '     Let Title = Replace(Title, "-", " ", 1, -1, vbBinaryCompare)
          Let Title = Application.WorksheetFunction.Trim(Title) ' In case any spaces caused by removing stuff
    
         Let RngWsYT11.Item(Cnt).Offset(0, 2).Value2 = Title
         Let Title = "" ' If i don't do this the  Title = Title &   coding will keep adding the titles together
         RngWsYT11.Item(Cnt).Offset(0, 2).Select ' This is helpful or else I cant see the worksheet being filled after the first few
        ' To get the next info i use a lot the  Split Split  bit Yasser showed me   https://eileenslounge.com/viewtopic.php?p=303638#p303638
        Dim Views As String
         Let Views = Split(TextBit, """},""views"":{""simpleText"":""", 2, vbBinaryCompare)(1)
         Let Views = Split(Views, " Aufrufe""}", 2, vbBinaryCompare)(0)
         Let Views = Replace(Views, ".", "", 1, -1, vbBinaryCompare) ' I don't like seperators of any kind
         Let RngWsYT11.Item(Cnt).Offset(0, 4).Value2 = Views
        Dim PubDate As String
         Let PubDate = Split(TextBit, "publishDate"":{""simpleText"":""", 2, vbBinaryCompare)(1)
         Let PubDate = Split(PubDate, """},""factoid"":[{", 2, vbBinaryCompare)(0)
         Let RngWsYT11.Item(Cnt).Offset(0, 3).Value2 = PubDate
        ' date nightmares again
        Dim PubDateV2 As Long, Naw As Long ' DateSerial(year, month, day)
        Dim PubDateRaw As String: Let PubDateRaw = Replace(PubDate, "Premiere am ", "", 1, 1, vbBinaryCompare)
         Let PubDateRaw = Replace(PubDateRaw, "Live übertragen am ", "", 1, 1, vbBinaryCompare)
         Let PubDateV2 = DateSerial(Right(PubDateRaw, 4), Mid(PubDateRaw, 4, 2), Left(PubDateRaw, 2))
         Let RngWsYT11.Item(Cnt).Offset(0, 1).Value2 = PubDateV2
         Let Naw = Evaluate("=NOW()"): ' Debug.Print Naw
    '         Debug.Print Format(PubDateV2, "\Da\y i\s " & "dddd")
    '         Debug.Print Format(PubDateV2, "dddd DD mmm YYYY") & " " & Format(Naw, "dddd DD mmm YYYY")
         Let RngWsYT11.Item(Cnt).Offset(0, 3).Value2 = PubDateV2
         Let RngWsYT11.Item(Cnt).Offset(0, 3).Value = PubDate
         Let RngWsYT11.Item(Cnt).Offset(0, 3).Value = Format(PubDateV2, "dddd DD mmm YYYY")
         Let RngWsYT11.Item(Cnt).Offset(0, 5).Value = Int(Views / (Naw - PubDateV2))
        Dim Likeses As String
         Let Likeses = Split(TextBit, "Mag ich\""-Bewertungen""},""accessibilityText"":""", 2, vbBinaryCompare)(1)
         Let Likeses = Split(Likeses, " \""Mag ich\""-Bewertungen", 2, vbBinaryCompare)(0)
            If InStr(1, Likeses, "Keine", vbBinaryCompare) = 0 Then ' Sometimes the likes are not there  or not showmn or something - only happend in a video not from the main author
             'Let RngWsYT11.Item(Cnt).Offset(0, 6).Value = Likeses
             Let RngWsYT11.Item(Cnt).Offset(0, 7).Value = Int(Likeses / (Naw - PubDateV2))
            Else
             Let Likeses = "Keine"
            End If
         Let RngWsYT11.Item(Cnt).Offset(0, 6).Value = Likeses
        RngWsYT11.Parent.Columns("A:H").AutoFit
        'Else
        'End If
        Next Cnt
    End Sub
    Last edited by DocAElstein; 02-27-2023 at 08:52 PM.

  8. #98
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,303
    Rep Power
    10

    VBA arrays coding to match up 2 slightly different Titles of same Video in 2 lists


    This is post #98 #post19705
    https://excelfox.com/forum/showthrea...ll=1#post19705 ???
    https://excelfox.com/forum/showthrea...ge10#post19705








    Final video Folder Video order
    I need to do some sorting. Order sorting, for example based on date, and / or add a date “stamp” …

    The problem is that my final modified titles on the actual videos will not tie up perfectly with those got from the scrapping coding. (But I note here that as time goes on the two will get closer, as I tend to apply more and more adjusting at the initial scrapping level , based on the modifications later that I find I need to do )

    I spent a few frustrating days trying to modify an efficient range.Find to do this, but it did not really do it without being very complicated so wasting the otherwise efficient way the range.Find seems to work.

    Finally a simpler VBA array coding seems OK for now, so I need a Simple VBA arrays coding to match up two Title list where the Titles for a particular video may be slightly different in the two lists

    VBA arrays coding to match up 2 slightly different Titles of same Video in 2 lists

    The function takes in
    __
    ( The value ( Title text ) to be looked for , the range (as range object) to be looked in ) returning the found cell as range object

    The basic idea is to split the Title to be looked for by a space so we have a one dimensional array of all the words in that title, and then we try to find a title in the list to be searched that has at least a certain number , HitsWish , of those words. That number is determined empirically , and if that amount of number is not found we keep reducing the HitsWish. So we do our best to match as many words as possible. That way we improve the chances of matching the right one

    Rem 1 This does some tidying of given Title text value to be looked for. This follows along the usual typical tidying up that I do of Titles, but as noted, as time goes on the two titles for the same video will get closer, as I tend to apply more and more adjusting at the initial scrapping level , based on the modifications later that I find I need to do

    Rem 2 is a bit of a customized fiddle thing. This is because there is a chance the first few words would never be found, even later, as I might have had some number or other ordering text added at the start. So I make an adjustment fiddle thing and variable that means I don’t include those words. Its basically based on a quick more conventional Range.Find thing to see if the first word(s) are anywhere in the list. If not then from then on we never try those.

    Rem 3 is the main searching Loop(s). Three of them, in a nested fashion.
    _ The outer loop , loops backwards if necessary to reduce the number of words we try to find in a single title in the list to be searched for a match.
    _ The next Loop inside is looping all the rows in the list of titles to be searched for a match
    _ The inner most loop goes through all (or most depending on what happened in Rem 2 ) of the words from the title to be looked for, and if we reach the required number of hits, HitsWish , then we terminate after passing the found cell to the result of the function, This is the bit that does that…
    Set TitlSrch = SrchClm.Item(Rw)
    Here is the initial final macro, at this stage :

    Code:
    ' This function is a VBA array looping thing. It tidies up a bit a string Title you give it in the typical way i might tidy up a title.  Then it splits that by spaces, and tries to match a lot of words from that in a Title in the given range of Titles. It reduces the amount of words it tries to match until it finds a match or gives up and tells you it never managed it
    Public Function TitlSrch(TrgtVal As String, SrchClm As Range) As Range 'TrgtVal is range selected value, SrchClm is LookUpTable   The return is the found cell range
     Let Application.EnableEvents = True
    Rem 1 Some tidying of given string.
    ' 1a) This follows along the usual typical tyding up that I do of Titles
    Dim SchTxt As String: Let SchTxt = Trim(TrgtVal) ' Trim(Trgt.Value)
        ' Initial check for if multi words if there are spaces
        If InStr(1, SchTxt, " ", vbBinaryCompare) > 0 Then ' Check for more than 1 word to look for==============
        ' Remove all but single space in between words to allow split via a space
         Let SchTxt = Evaluate("=TRIM(SUBSTITUTE(" & """" & SchTxt & """" & ",CHAR(160),CHAR(32)))") ' TRIM function trims the 7-bit ASCII space character (value 32). In the Unicode character set, there is an additional space character called the nonbreaking space character that has a decimal value of 160. This character is commonly used in Web pages as the HTML entity,  . By itself, the TRIM function does not remove this nonbreaking space character.       https://www.excelforum.com/excel-formulas-and-functions/1217202-is-there-a-function-similar-to-trim-but-that-only-removes-trailing-spaces-2.html   Note also that spaces more than 1 are removed from in between text
        ' Do some empirical text tidying up that I might typically have done in a final video title
          Let SchTxt = Replace(SchTxt, "ä", "ae", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "ü", "ue", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "ö", "oe", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "Ä", "AE", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "Ü", "UE", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "Ö", "OE", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "-", " ", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "#wiegehtyoutube", "", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "!", "", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "?", "", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "ß", "ss", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "€", "", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, ":", " ", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "#", "", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "&", " ", 1, -1, vbBinaryCompare) '
          Let SchTxt = Replace(SchTxt, "'", " ", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "‚", " ", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, """", "", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "“", " ", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "„", " ", 1, -1, vbBinaryCompare)  '    „ajdffak“
          Let SchTxt = Replace(SchTxt, "+", "", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, ".", "", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "YouTube", "UT", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "[", "(", 1, -1, vbBinaryCompare)
          Let SchTxt = Replace(SchTxt, "]", ")", 1, -1, vbBinaryCompare)
    '     Let SchTxt = Replace(SchTxt, "-", " ", 1, -1, vbBinaryCompare)
    '     Let SchTxt = Replace(SchTxt, "-", " ", 1, -1, vbBinaryCompare)
    '     Let SchTxt = Replace(SchTxt, "-", " ", 1, -1, vbBinaryCompare)
    '     Let SchTxt = Replace(SchTxt, "-", " ", 1, -1, vbBinaryCompare)
    '     Let SchTxt = Replace(SchTxt, "-", " ", 1, -1, vbBinaryCompare)
    
    '     Let SchTxt = Replace(schtxt, "-", " ", 1, -1, vbBinaryCompare)
    '     Let SchTxt = Replace(schtxt, "-", " ", 1, -1, vbBinaryCompare)
         Let SchTxt = Application.WorksheetFunction.Trim(SchTxt) ' In case any spaces caused by removing stuff, as we still just want one space between for splitting
        
        Dim SchPts() As String: Let SchPts() = VBA.Strings.Split(SchTxt, " ", -1) ' Do split on single space to get multiple words
     '1b) removing of some words, for example if they are very common, or small
        Dim strNew As String
        Dim Cnt As Long
            For Cnt = LBound(SchPts()) To UBound(SchPts())
                If Len(SchPts(Cnt)) < 5 Then
                ' ignore short words
                Else
                    Select Case SchPts(Cnt)
                     Case "YouTube", "Youtube"
                      ' ignore those words
                     Case Else
                      Let strNew = strNew & SchPts(Cnt) & " " ' Building string from any nmot ignored words
                    End Select
                End If
            Next Cnt
         Let strNew = Left(strNew, Len(strNew) - 1) ' Take off last space
            If InStr(1, strNew, " ", vbBinaryCompare) = 0 Then GoTo SS ' go to Look for just one word
         Let SchPts() = VBA.Strings.Split(strNew, " ", -1)
    
        Dim HitsWish As Long: Let HitsWish = 6 ' The most number of words we must find to make it as sucess before trying less words. Set this empirically
    
        Rem 2 checking to find at least one word
        Dim Adj As Long ' This is used to add to array index if first word was not found
            For Cnt = 0 To UBound(SchPts())
            Dim FndCel As Range ' used for result of any search
             Set FndCel = SrchClm.Find(what:=SchPts(0 + Adj), LookIn:=xlFormulas, Lookat:=xlPart, searchorder:=xlByRows, searchdirection:=xlNext, MatchCase:=False)
                If FndCel Is Nothing Then
                 Let Adj = Adj + 1
                    If Adj > UBound(SchPts()) Then MsgBox Prompt:="Cant find any words in " & vbCr & vbLf & """" & TrgtVal & """" & vbCr & vbLf & """" & strNew & """": Debug.Print "Cant find any words in " & vbCr & vbLf & """" & TrgtVal & """" & vbCr & vbLf & """" & strNew & """": GoTo TheEnd
                Else
                 Exit For
                End If
            Next Cnt
            If Adj = UBound(SchPts()) Then Let SchTxt = SchPts(UBound(SchPts())): GoTo SS  ' We only got one word to look for, the last word in our array of words to look for, the only word that was found actually, so going to SS here means I will look for it again there, nevermind
        ' At this point we have at least two words and hopefully at least 4. Normally we would have a total of UBound(SchPts())+1. But this will be reduced by  Adj
    '        If UBound(SchPts()) + 1 - Adj = 6 Then Let HitsWish = 6 ' We only have 6 words to search for
            If UBound(SchPts()) + 1 - Adj = 5 Then Let HitsWish = 5 ' We only have 5 words to search for
            If UBound(SchPts()) + 1 - Adj = 4 Then Let HitsWish = 4 ' We only have 4 words to search for
            If UBound(SchPts()) + 1 - Adj = 3 Then Let HitsWish = 3 ' We only have three words to search for
            If UBound(SchPts()) + 1 - Adj = 2 Then Let HitsWish = 2 ' We only have two words to search for
    
        Rem 3 Check for Hits wanted, or less
        Dim arrSrchClm() As Variant: Let arrSrchClm() = SrchClm.Value
        Dim Hits
            For Hits = HitsWish To 1 Step -1 ' If we dont jump out of the loop, then we reduce the hit cpount goal and try again
            Dim Rw As Long
                For Rw = 1 To UBound(arrSrchClm(), 1)
                Dim CntHit As Long: Let CntHit = 0
                    For Cnt = 0 + Adj To UBound(SchPts())
                        If InStr(1, arrSrchClm(Rw, 1), SchPts(Cnt), vbBinaryCompare) > 0 Then
                         Let CntHit = CntHit + 1
                            If CntHit = Hits Then Debug.Print Hits & " Hits for  """ & SrchClm.Item(Rw).Value & """": Set TitlSrch = SrchClm.Item(Rw): GoTo TheEnd
                        Else
                        End If
                    Next Cnt
                Next Rw
            Next Hits
        Debug.Print "Nothing": Exit Function
    SS: ' Do not put this before the Else idiot!
         Set FndCel = SrchClm.Find(what:=VBA.Strings.Trim(SchTxt), LookIn:=xlFormulas, Lookat:=xlPart, searchorder:=xlByRows, searchdirection:=xlNext, MatchCase:=False) ' I might sometimes be doing this twice. Nevermind
            If Not FndCel Is Nothing Then
             Set TitlSrch = FndCel: Let Hits = 1: GoTo TheEnd
            Else
    '         Selection.Offset(0, 2).Select: GoTo TheEnd '''_- Exit Function ' case no single word match
            End If
        End If ' Finished case single word to look for or multiple words to look for=============================
    TheEnd: '''_- Exit Function
    ' GoToEmptyCellNearby
     Let Application.EnableEvents = True
    
    EndFuk:
    End Function


    In the next post I will add some notes for the first few actual uses of that, ( and may then edit the above function coding a bit as I go along )
    Last edited by DocAElstein; 02-28-2023 at 06:23 PM.

  9. #99
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,303
    Rep Power
    10
    At the moment I have in a file , WieGehtsYouTube.xls , and the worksheet , "ElevenDigitYT(1)" , something close to all the videos from the Video Play list, and a few from the Popular Play list that did not appear in the Video PlayList.

    That should have close to all the actual videos in my main storage, which from a recent Dir thing gives me the Worksheet WMV
    Last edited by DocAElstein; 02-28-2023 at 06:25 PM.

  10. #100
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,303
    Rep Power
    10
    AKJDakjdha
    Last edited by DocAElstein; 01-24-2023 at 09:25 PM.
    ….If you are my competitor, I will try all I can to beat you. But if I do, I will not belittle you. I will Salute you, because without you, I am nothing.
    If you are my enemy, we will try to kick the fucking shit out of you…..
    Winston Churchill, 1939
    Save your Forum..._
    _...KILL A MODERATOR!!

Similar Threads

  1. Tests and Notes on Range Referrencing
    By DocAElstein in forum Test Area
    Replies: 70
    Last Post: 02-20-2024, 01:54 AM
  2. Tests and Notes for EMail Threads
    By DocAElstein in forum Test Area
    Replies: 29
    Last Post: 11-15-2022, 04:39 PM
  3. Some Date Notes and Tests
    By DocAElstein in forum Test Area
    Replies: 0
    Last Post: 11-23-2021, 10:40 PM
  4. Replies: 39
    Last Post: 03-20-2018, 04:09 PM
  5. Notes tests. Excel VBA Folder File Search
    By DocAElstein in forum Test Area
    Replies: 39
    Last Post: 03-20-2018, 04:09 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •