Stata help

  • 20-03-2010 02:16PM
    #1
    Posts: 6,176 ✭✭✭


    Stata fiends, I need help!

    I have a dataset which has mostly null values (the . ) and I need to run an if statement, but I'm not sure what value to put in for the dot.

    Any help?


Comments

  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    You mean blank cells? Doesn't Stata just omit the obs?


  • Posts: 6,176 ✭✭✭ [Deleted User]


    Yeah, but I have a column that is . . . . . . . some number . . . . . . some other number etc

    I want to create a column that will be 0 if the first column is blank, 1 if not


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    I'm slightly confused. You mean:

    I want to create a column that will be 0 if the first row is blank, 1 if not?


  • Posts: 6,176 ✭✭✭ [Deleted User]


    C1 C2
    . 0
    . 0
    . 0
    4 1
    . 0
    . 0
    . 0
    5 1
    . 0

    This is what I want to end up with. C1 exists at the moment


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    I don't think you can get Stata to recognise the null variables unless you give them a non-indeterminate value. What you are asking for kind of reminds me of a companion form matrix, maybe Stata has a similar tool.

    http://en.wikipedia.org/wiki/Companion_matrix


  • Registered Users, Registered Users 2 Posts: 8,452 ✭✭✭Time Magazine


    Does

    gen exists = 1
    replace exists = 0 if myvar == .


    not work?


  • Posts: 6,176 ✭✭✭ [Deleted User]


    Nope, that replaces all the entries with 1.
    Stata doesn't seem to know what to do with the blanks.


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Do you have the XLS dataset? Maybe you could ctrl+H the dots and replace them with something ("null") that Stata will recognise?


  • Registered Users, Registered Users 2 Posts: 8,452 ✭✭✭Time Magazine


    Have you tried my code above but with =="." instead of ==. ?

    As alluded to by the mod of the Boring forum, you could change this with a bit of brute force. Enter the command edit, select and copy all the cells, paste into Excel, "Find and Replace All" in Excel, paste back into Stata. I feel dirty even suggesting it.
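
For anyone hitting the same problem, here is a minimal Stata sketch of the 0/1 column asked for above. It assumes nothing about the real dataset: the names c1, c1s, c2 and c2s are made up, and the data only mirror the C1 column posted earlier. The `== .` test applies when the column is numeric. If the dots came in as literal text, the column is a string and the test has to be against "." instead.

```stata
* Hypothetical data mirroring the C1 column posted earlier in the thread
clear
input c1
.
.
.
4
.
.
.
5
.
end

* Numeric column: missing() is true for . (and the extended codes .a to .z)
generate byte c2 = !missing(c1)

* String column: build a text copy in which blanks are a literal "."
generate c1s = cond(missing(c1), ".", string(c1))
* Treat both "" and "." as blank; trim() removes stray spaces first
generate byte c2s = !inlist(trim(c1s), "", ".")

* Both routes should agree row by row
assert c2 == c2s
list, clean
```

Running `describe` on the real column shows its storage type, which tells you which of the two tests applies.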


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Says the mod of the LaTeX forum.

    :p


  • Posts: 6,176 ✭✭✭ [Deleted User]


    The dataset is massive and will keep growing, so automation all the way!

    I've gotten a workaround but I'll try that tomorrow. Cheers lads!


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    I'm having a little problem. I am importing a dataset into Stata, but when I go to look at the data, it is highlighting one of my columns (GDP) in red, suggesting that it is non-numerical. Indeed, when I attempt to create a log version of this variable, it tells me that it can't. I have looked up and down the column in question, and can see only numbers. No letters, no spaces.

    Any ideas?


  • Posts: 6,176 ✭✭✭ [Deleted User]


    Something wrong in the underlying Excel file?


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    That's my only inclination, tbh. Is there any way that I can tell Stata to reclassify the column? Out of interest.


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Just remade the CSV from the original XLS file, and the same thing occurred. Is Stata known to have issues with Office 2007? I don't think that could be it, since I have used this combo before.


  • Posts: 6,176 ✭✭✭ [Deleted User]


    I use Stat/Transfer to convert it straight into a .dta file.

    Copy the column in the original file and see if the duplicate works.


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Yeah, that's what I tried. I will give that Stat/Transfer thing a go.


  • Registered Users, Registered Users 2 Posts: 8,452 ✭✭✭Time Magazine




  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving



    That's handy. I figured it out last night anyway: it turns out the commas Excel inserts as thousands separators were causing it. There is a setting to switch them off.
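
On the earlier question of telling Stata to reclassify the column: `destring` does that from inside Stata. A minimal sketch, assuming the column is called gdp and that the only non-numeric characters in it are thousands-separator commas (both are assumptions, since the real file was never posted):

```stata
* Hypothetical stand-in for a GDP column imported as text
clear
input str12 gdp
"1,234"
"56,789"
"1,000,000"
end

* Storage type is str12 at this point, so ln(gdp) would fail
describe gdp

* ignore(",") strips the commas, then the values are converted to numeric
destring gdp, replace ignore(",")

* The log transform now works
generate lngdp = ln(gdp)
list, clean
```

This keeps the fix in the do-file, so it survives the next re-export from Excel.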


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Ok, I have a country-level dataset and I am creating three continent dummies: one each for Asia, Africa and Other (anything not in the first two).

    So, when I regress:

    Y = Asia + Africa + X

    no problems. But when I include the 'Other':

    Y = Asia + Africa + Other + X

    Asia gets "dropped". Asia does not get dropped when:

    Y = Asia + Other + X


    Any ideas?


  • Posts: 6,176 ✭✭✭ [Deleted User]


    Collinearity?


  • Registered Users, Registered Users 2 Posts: 8,452 ✭✭✭Time Magazine


    Zaraba got it. You have perfect multicollinearity because you fell into the "dummy variable trap."

    Think about it: if you have a male dummy in a regression, you see the effect of being male. That implies you're working off a base of being female. Including a female dummy as well is redundant, because male + female equals 1 for every observation, which is exactly the constant term. That makes your X'X matrix singular, i.e. perfect multicollinearity. The same thing applies to your three-continent example.
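
The trap is easy to reproduce on simulated data. The sketch below is an illustration only: the variable names, sample size and coefficients are invented and have nothing to do with the dataset in the question.

```stata
* Simulated country-level data (all names and numbers are made up)
clear
set seed 12345
set obs 90
generate region = mod(_n, 3) + 1      // 1 = Asia, 2 = Africa, 3 = Other
generate byte asia   = region == 1
generate byte africa = region == 2
generate byte other  = region == 3
generate x = rnormal()
generate y = 1 + 0.5*asia - 0.3*africa + 2*x + rnormal()

* The three dummies sum to 1 in every row, duplicating the constant
assert asia + africa + other == 1

* With the constant included, Stata omits one dummy for collinearity
regress y asia africa other x

* Leaving Other out makes it the base category; nothing is omitted
regress y asia africa x

* Same model via factor-variable notation (Stata 11 or later), base = 3
regress y ib3.region x
```

Which dummy Stata chooses to omit in the first regression is its own decision, so it is safer to pick the base category yourself.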


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Ok, I get that. But I am simply emulating a published paper, where they do just as I described. Any suggestions?


  • Posts: 6,176 ✭✭✭ [Deleted User]


    Treat one of the dummies as the constant.
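
One reading of the suggestion above is to keep all three dummies and drop the regression constant instead, so that each dummy coefficient becomes that group's own intercept. Whether the published paper does this is an assumption, since the paper is not named in the thread. A minimal sketch on simulated data, with invented names and numbers:

```stata
* Simulated data (all names and numbers are made up)
clear
set seed 2010
set obs 90
generate region = mod(_n, 3) + 1      // 1 = Asia, 2 = Africa, 3 = Other
generate byte asia   = region == 1
generate byte africa = region == 2
generate byte other  = region == 3
generate x = rnormal()
generate y = 1 + 0.5*asia - 0.3*africa + 2*x + rnormal()

* All three dummies, no constant: nothing is omitted
regress y asia africa other x, noconstant
scalar b_other = _b[other]

* Same fit with Other as the base category and a constant
regress y asia africa x

* The constant here is the Other intercept from the first model
assert reldif(_b[_cons], b_other) < 1e-6
```

The two models fit the data identically, but with `noconstant` Stata reports an R-squared measured around zero rather than around the mean of y, so that figure should not be compared across the two.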


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Oh right. Good thinking.


  • Posts: 6,176 ✭✭✭ [Deleted User]


    Lesson 1 in undergrad time series work!
    Thank you Mike Harrison.


  • Registered Users, Registered Users 2 Posts: 8,452 ✭✭✭Time Magazine


    mjh.jpg

    Greatest man ever.

